Resources / The Evolution Blog
Fast, Feature-Rich and Mutable: ClickHouse Powers Darwinium's Security and Fraud Analytics Use Cases
Ananth Gundabattula
Building a Database Capable of Making Risk Decisions at the Edge
Challenges for the Security and Fraud Domain
Operating in the cyber security and fraud domains, Darwinium needed the capabilities to:
Ingest and process data at a high throughput:
- Having a database backend that can handle fast, high-throughput writes is a fundamental requirement. Additionally, that data must be available for analysis as soon as it is ingested; this is an expectation of today's digitally driven workloads, and of tomorrow's.
Deal with large volumes of data:
- Darwinium is built from the ground up on the construct of a journey, wherein a digital asset is continuously monitored. This results in large volumes of data, because the Darwinium real-time engine needs to continuously profile and monitor each digital asset, and the database needs to be capable of analysing that data at scale. The question of scale also arises because an entire year's worth of data may need to be processed. For example, a year's worth of data may need to be queried to examine the behaviour of a single credit card. Another use case is an account administrator analysing years' worth of data to identify the outliers among all logins to a website.
- Technical types of fraud and security challenges, including malware, remote desktop access and bots, are highly forensic in nature. They require storing most of the digital datapoints available on a journey step for future investigations, which only then pinpoint the patterns that help to detect the threat.
- Darwinium emphasises continuously profiling and monitoring digital journeys for a comprehensive view of user intent. This results in a many-to-one relationship between stored records and each completed journey. Additionally, some types of fraud, such as account takeover, scams and social engineering, require long timeframes of 'normal' behavioural data to compare against current interactions in order to detect changes. Storing multi-step journey data over long timeframes increases both the number of records and the storage needed.
- The result is lots of data: from the amount of data stored per record, the many-to-one relationship of records to journeys, and the long lookback timeframes needed for investigations.
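As a minimal sketch, the journey-step storage and the single-card lookback query described above might look like the following in ClickHouse. The table name, columns and partitioning scheme are illustrative assumptions, not Darwinium's actual schema:

```sql
-- Hypothetical journey-step table; names and layout are illustrative only.
CREATE TABLE journey_steps
(
    event_time DateTime,
    journey_id UUID,
    card_hash  String,                    -- hashed card identifier
    step_type  LowCardinality(String),    -- e.g. 'login', 'payment'
    risk_score Float32
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)
ORDER BY (card_hash, event_time);

-- A year's worth of behaviour for a single credit card.
SELECT step_type, count() AS steps, avg(risk_score) AS avg_risk
FROM journey_steps
WHERE card_hash = 'abc123'
  AND event_time >= now() - INTERVAL 1 YEAR
GROUP BY step_type;
```

Ordering by `(card_hash, event_time)` keeps each card's year of history physically clustered, so the lookback touches only a small slice of the stored data.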
Have capabilities to analyse data in a complex way:
- The nature of fraud data requires complex, interactive analysis. A feature-rich analytical stack makes a significant difference here, translating directly into reduced human costs. A database system that can respond in one second or less while providing a feature-rich functional toolbox is exactly the combination that modern database systems need to support.
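One hypothetical shape such an analytical query could take is the login-outlier analysis mentioned earlier, using ClickHouse's built-in quantile aggregates. The table and column names are assumptions carried over for illustration, not a real Darwinium query:

```sql
-- Accounts whose login volume sits above the 99th percentile
-- of the whole population over a long lookback window.
SELECT account_id, count() AS login_count
FROM journey_steps
WHERE step_type = 'login'
  AND event_time >= now() - INTERVAL 2 YEAR
GROUP BY account_id
HAVING login_count >
(
    SELECT quantile(0.99)(cnt)
    FROM
    (
        SELECT count() AS cnt
        FROM journey_steps
        WHERE step_type = 'login'
          AND event_time >= now() - INTERVAL 2 YEAR
        GROUP BY account_id
    )
)
ORDER BY login_count DESC;
```

The `quantile(0.99)(cnt)` aggregate is the kind of functional-toolbox feature the point above refers to: the outlier threshold is computed in the same interactive query rather than in a separate batch job.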
To read the full analysis of Darwinium's database engine of choice and its capabilities, visit the ClickHouse Blog.