January 25, 2024

Finding security threats with DataBee from Comcast Technology Solutions

Last week, DataBee announced the general availability of DataBee v2.0. Alongside a new strategic technology partnership with Databricks, we released new cybersecurity and Payment Card Industry Data Security Standard (PCI DSS) 4.0 reporting capabilities.

In this blog, we’ll dive into the new security threat use cases that you can unlock with a security, risk, and compliance data fabric platform.

DataBee for security practitioners and analysts

In security operations, detecting incidents in a security information and event management (SIEM) tool is often described as looking for a needle in a haystack of logs. Another fun (or not-so-fun) SIEM metaphor is a leaky bucket.

In an ideal world, all security events and logs would be ingested, parsed, normalized, and enriched into the SIEM, and then the events would be cross-correlated using advanced analytics. Basically, logs would stream into your bucket, the SIEM, and every breach would be detected.

In reality, there are holes in the bucket that allow breaches to persist undetected. SIEMs can be difficult to manage and maintain. Organization-level customizations, combined with unique and ever-changing vendor formats, can lead to detection gaps between tools and missed opportunities to avert incidents. Additionally, cost-conscious organizations often make trade-offs on high-volume sources that leave analysts unable to tap into valuable insights. All these small holes add up.


What if we could make the security value of data more accessible and understandable to security professionals of all levels? DataBee makes security data a shared language. As a cloud-native security, risk, and compliance data fabric platform, DataBee engages early in the data pipeline to ingest data from any source. It then enriches and correlates the data, normalizes it to an extended version of the Open Cybersecurity Schema Framework (OCSF), and delivers it to your data lake for long-term storage and to your SIEM for advanced analytics.

Revisiting the haystack metaphor, if hay can be removed from the stack, a SIEM will be more efficient and effective at finding needles. With DataBee, enterprises can efficiently divert data, the “hay,” from an often cost-prohibitive and overwhelmed SIEM. This enables enterprises to manage costs and improve the performance of mission-critical analytics in the SIEM. DataBee uses active detection streams to complement the SIEM, identifying threats through vendor-agnostic Sigma rules and detections. Detections are streamed with the necessary business context to a SIEM, SOAR, or data lake. DataBee brings to market a platform inspired by security analysts to tackle use cases that large enterprises have long struggled with, such as:

  • SIEM cost optimization
  • Standardized detection coverage
  • Operationalizing security findings

SIEM cost optimization

Active detection streams from DataBee provide an easy-to-deploy solution that enables security teams to send their “needles” to their SIEM and their “hay” to a more cost-effective data lake. Data that would otherwise often be discarded can now be analyzed en route. Enterprises need only retain in the SIEM the active detection stream findings and the security logs needed for advanced analytics and reporting. By removing the “hay,” enterprises can reduce their SIEM operating costs.
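The needle-and-hay split can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DataBee's implementation: the `Router` class, the event field names, and the `long_dns_query` detection with its 60-character threshold are all hypothetical, and two in-memory lists stand in for the SIEM and the data lake.

```python
from dataclasses import dataclass, field
from typing import Callable

Event = dict  # a normalized log event; field names below are hypothetical


@dataclass
class Router:
    """Stores every event in the data lake and forwards detection hits to the SIEM."""
    detections: list[Callable[[Event], bool]]
    siem: list = field(default_factory=list)       # stand-in for a SIEM forwarder
    data_lake: list = field(default_factory=list)  # stand-in for low-cost storage

    def route(self, event: Event) -> None:
        # All events (the "hay") are kept in the data lake, so nothing is discarded.
        self.data_lake.append(event)
        # Only events matching at least one detection (the "needles") reach the SIEM.
        if any(rule(event) for rule in self.detections):
            self.siem.append(event)


def long_dns_query(event: Event) -> bool:
    """Toy detection: unusually long DNS query names (arbitrary threshold)."""
    return event.get("type") == "dns" and len(event.get("query", "")) > 60


router = Router(detections=[long_dns_query])
events = [
    {"type": "dns", "query": "www.example.com"},
    {"type": "dns", "query": "a" * 70 + ".example.net"},
    {"type": "auth", "user": "alice"},
]
for e in events:
    router.route(e)

print(len(router.data_lake), len(router.siem))
```

In this sketch the detection runs while the event is in flight, so the verbose DNS traffic still gets analyzed even though only the single suspicious query is forwarded to the SIEM.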

SIEM Optimization workflow

The optimized cloud architecture enables security organizations to gain insights into logs that are too high in volume, or too limited in context, to leverage in the SIEM. For example, DNS logs are often considered too verbose to store in the SIEM. They arrive in high volume, and each event retains so little information that it has low value on its own. The limited information also makes DNS logs difficult to cross-correlate with the disparate data sources needed to validate a security incident.

Another great log source example is Windows Event Logs. There are hundreds of validated open-source Sigma detections for Windows Event Logs that identify all kinds of malicious and suspicious behavior. Leveraging these detections has traditionally been difficult because of the scale required, both in the number of detections and in the volume of data to run them against. With DataBee’s cloud-native active detection streams, the analytics are applied as the data is normalized and enriched, giving security teams new insights into the potential risks facing their organization. DataBee’s power and scale complement the SIEM’s capabilities, plugging some of the holes in our leaky bucket.

Analyst fatigue can be lessened by suppressing security findings for users or devices that make a finding less reliable. With DataBee’s suppression capability, you can filter and take action on security findings based on the situation. Selecting “Drop” as the action ignores the event, which is ideal for events known to be false positives in the organization. Alternatively, applying an “Informational” action reduces the severity and risk level of the finding to Info while still allowing the finding to be tracked for historical purposes. The Informational level suits tuning that requires long-term auditability. The scheduling option lets you account for recurring known events, such as change windows, that might fire alerts or cause other issues leading to false positives.
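To make the three options concrete, here is a minimal Python sketch of how Drop, Informational, and a schedule could interact. It assumes a simplified rule model: the `SuppressionRule` class, its field names, and the `apply_suppression` helper are hypothetical and do not reflect DataBee's actual configuration format.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class SuppressionRule:
    # Hypothetical fields, chosen only for this sketch.
    user: str                                   # suppress findings tied to this user
    action: str                                 # "drop" or "informational"
    window: Optional[tuple[time, time]] = None  # optional daily schedule (start, end)

    def applies(self, finding: dict, now: datetime) -> bool:
        if finding.get("user") != self.user:
            return False
        if self.window is None:
            return True  # no schedule: the rule is always active
        start, end = self.window
        return start <= now.time() <= end


def apply_suppression(finding: dict, rules: list[SuppressionRule],
                      now: datetime) -> Optional[dict]:
    """Return None if the finding is dropped, else the (possibly downgraded) finding."""
    for rule in rules:
        if rule.applies(finding, now):
            if rule.action == "drop":
                return None
            if rule.action == "informational":
                # Keep the finding for auditability, but lower its severity.
                return {**finding, "severity": "Info"}
    return finding


rules = [
    # Downgrade findings from a patching account during a nightly change window.
    SuppressionRule(user="svc_patch", action="informational",
                    window=(time(1, 0), time(3, 0))),
    # Always drop findings from a known vulnerability scanner account.
    SuppressionRule(user="scanner", action="drop"),
]

finding = {"user": "svc_patch", "severity": "High", "title": "PowerShell execution"}
in_window = apply_suppression(finding, rules, datetime(2024, 1, 25, 2, 0))
out_of_window = apply_suppression(finding, rules, datetime(2024, 1, 25, 12, 0))
dropped = apply_suppression({"user": "scanner", "severity": "Medium"},
                            rules, datetime(2024, 1, 25, 12, 0))
```

Here the same finding is downgraded to Info inside the change window but keeps its High severity at noon, while the scanner finding is discarded regardless of time.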

Suppression rules demo view

By applying the analytics and tuning to the enriched logs as they are streamed to more cost-effective long-term storage in the data lake, security teams can detect malicious behavior like PowerShell activity or DNS tunneling. Additionally, DataBee’s Entity Resolution not only enriches the logs but learns more about your organization from them, discovering assets that may be untracked or unknown in your network.

Standardized detection coverage

With the ever-evolving threat landscape, detection content is constantly updated to stay relevant. As a result, security organizations have taken on a more central role in managing content across solutions. As Sigma-formatted detections have grown popular with both security researchers and vendors, many large enterprises are beginning to migrate existing custom detections to open-source formats managed via GitHub. DataBee imports Sigma detection rules that are managed in GitHub, so detection content can be operationalized quickly. Security organizations can centralize and standardize content management for all security solutions, not just DataBee.

Active detection streams apply Sigma rules, an open-source signature format, to security data mapped to a DataBee-extended version of OCSF, so they integrate into the existing security ecosystem with minimal customization. DataBee handles the translation from Sigma to OCSF, which lowers the effort needed to adopt the rules and supports organizations on their journey to vendor-agnostic security operations. With Sigma-formatted detections leveraging OCSF in DataBee, organizations can swap out security vendors without needing to update log parsers or security detection content.
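As a rough illustration of what translating Sigma to a normalized schema involves, the Python sketch below rewrites the field names in a Sigma-style selection to dotted attribute paths and evaluates the result against a nested event. The `FIELD_MAP` entries, the event layout, and the helper functions are assumptions made for this example. They are not DataBee's mapping, and the paths should not be read as an authoritative OCSF reference. Only the `contains`, `startswith`, and `endswith` modifiers are handled, and all conditions in the selection must match.

```python
# Hypothetical mapping from Sigma field names to OCSF-style attribute paths.
FIELD_MAP = {
    "Image": "process.file.path",
    "CommandLine": "process.cmd_line",
}

# A Sigma selection as a plain dict, as a YAML parser would yield it.
sigma_selection = {
    "Image|endswith": "\\powershell.exe",
    "CommandLine|contains": "-enc",
}

MATCHERS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "startswith": lambda actual, expected: actual.startswith(expected),
    "endswith": lambda actual, expected: actual.endswith(expected),
}


def translate(selection: dict) -> list[tuple[str, str, str]]:
    """Rewrite Sigma field names to mapped paths, keeping each modifier."""
    conditions = []
    for key, value in selection.items():
        name, _, modifier = key.partition("|")
        conditions.append((FIELD_MAP[name], modifier or "equals", value))
    return conditions


def lookup(event: dict, path: str):
    """Follow a dotted path through nested dicts; return None if it is missing."""
    current = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(event: dict, conditions: list[tuple[str, str, str]]) -> bool:
    """True only if every condition matches (string comparison, case-insensitive)."""
    for path, modifier, expected in conditions:
        actual = lookup(event, path)
        if not isinstance(actual, str):
            return False
        if not MATCHERS[modifier](actual.lower(), expected.lower()):
            return False
    return True


conditions = translate(sigma_selection)

suspicious = {"process": {
    "file": {"path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe"},
    "cmd_line": "powershell.exe -enc SQBFAFgA",
}}
benign = {"process": {
    "file": {"path": "C:\\Windows\\notepad.exe"},
    "cmd_line": "notepad.exe",
}}

print(matches(suspicious, conditions), matches(benign, conditions))
```

Because the rule is expressed against the normalized paths rather than a vendor's raw field names, the same conditions would keep working if the underlying log source changed, provided the new source is mapped to the same schema.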

Operationalizing security findings

One of DataBee’s core principles is to meet you where you are with your data. The intent is to integrate into your existing workflows and tools and avoid amplifying the “swivel chair” effect that plagues every security analyst. In keeping with the vendor-agnostic approach, DataBee security findings generated by active detection streams can be output in OCSF format to S3 buckets. This output can be configured for ingestion and immediate use in major SIEMs.

DataBee Architecture

Leveraging active detection streams with Entity Resolution in DataBee enables organizations to identify threats using vendor-agnostic detections, with all the necessary business context attached, as the data streams toward its destination. Used in conjunction with the SIEM, DataBee gives security teams out-of-the-box visibility into potential risks facing their organization, without the noise.
