Why we built Teleskope
As we wrap up our trip to RSA 2024, it’s clearer than ever that the data security space is flooded with vendors listing dozens of acronyms, trying to be everything for every customer. There is so much posturing and marketing that it can be hard to cut through the noise to figure out who is truly innovating and building products that solve real problems and deliver real value. So we thought it would be the perfect time to introduce everyone to Teleskope. We’re creating a different kind of data protection platform, one built by security and engineering practitioners, for security and engineering practitioners, and we’re excited to share our vision with you all.
Our Story
Julie and I first met while working together on the Airbnb security team, where we faced firsthand the challenges associated with collecting massive amounts of data, from both security and privacy perspectives. I was on the data security team, responsible for protecting our customers' data and complying with privacy regulations like GDPR and CCPA.
The Data Maximization Problem
Many organizations struggle to protect their data. As companies amass petabytes of data in pursuit of unlocking business value, that data ends up sprawling across their entire ecosystem, spanning internal data stores, warehouses, and third-party vendors. This data maximization poses both security and business concerns. As valuable data gets buried amidst billions of files that don’t really mean anything for the business, companies lose their ability to make use of the data they’ve collected. It also increases risk, because it becomes impossible to pinpoint where sensitive data lives, where it’s going, who has access to it, and whether it’s protected or at risk. If you don’t know where sensitive data lives, how can you protect it?
The Manual Approach Problem
At Airbnb, we were fortunate to have a large security team, composed mostly of software engineers. This allowed us to build our internal data security platform from scratch. But most companies don't have this luxury, and frankly, building an internal data security platform is probably the wrong approach for 99% of companies out there. Data classification is a complex problem, and building it right requires years of work, constant updates, and regular retraining, which ends up being far more expensive than purchasing a tool. This is why most companies don’t even bother developing tools, and instead choose to tackle the data sprawl problem manually. Datasets are labeled by their owners or by a centralized data governance team. Data is redacted through manual scripting if a leak happens to be discovered, and data subject rights requests are handled manually via a SQL script that often hasn't been updated in years. The problem with this approach is that it doesn’t scale (across your different structured data stores, data warehouses, unstructured data, and third parties), it's point-in-time, and it’s prone to human error, leaving significant gaps in your security posture. And for companies that are late to the game and already have massive amounts of unlabeled data, this can be a daunting and expensive effort.
The Data Classification Problem
There are tens, if not hundreds, of products that claim to classify data accurately, quickly, and automatically. It might seem that data classification has been commoditized, and that it is something you can get from an existing vendor or from your cloud provider. But most of those tools fail miserably when applied to messy production data. In our own experience, the majority of the classifications they generate end up being false positives, so teams spend more effort sifting through the results than they would have spent classifying the data manually. Classification is nuanced and complicated because data is nuanced and complicated. Tabular data, log files, code, legal documents, conversational data, and so on are all inherently different from one another, and classification engines need to effectively support all of these data types, since they're the rule, not the exception, in most data ecosystems.
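To make the false-positive problem concrete, here is a minimal sketch of how a pattern-only classifier over-reports, and how a single validation step trims the noise. It is purely illustrative and is not how Teleskope classifies data; the function names and the sample string are made up for this example, and the Luhn checksum is just one well-known way to validate card-like numbers.

```python
import re

# Naive rule: any standalone run of 16 digits "looks like" a card number.
CARD_PATTERN = re.compile(r"\b\d{16}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def naive_matches(text: str) -> list[str]:
    """Pattern-only classification: flags every 16-digit run."""
    return CARD_PATTERN.findall(text)


def validated_matches(text: str) -> list[str]:
    """Pattern plus checksum: drops candidates that cannot be card numbers."""
    return [m for m in naive_matches(text) if luhn_valid(m)]


# Hypothetical log line: an order ID and a well-known test card number.
sample = "order 1234567890123456 paid with 4111111111111111"
print(naive_matches(sample))      # flags both numbers
print(validated_matches(sample))  # keeps only the one that passes the checksum
```

Even this toy example only handles one entity type in one kind of text, which is part of why a regex catalogue tends to break down on real production data.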
The Marketing vs Reality Gap
There’s a huge issue in the cyber startup ecosystem, and perhaps the startup ecosystem as a whole, of exaggerating capabilities or marketing features that just don’t exist. It can be very hard for someone to discern what companies actually do versus what they claim to do, and everything is gatekept behind a series of demo calls. In the data security space in particular, existing vendors, ranging from legacy DLP solutions to modern data governance and DSPM startups, claim to have solved the problem of data sprawl. They promise advanced data classification using “the power of GenAI” and automated remediation of data security issues. Their websites list dozens of use cases and features, serving as your one-stop-shop for data security. But these promises quickly crumble when applied to production data. Classifications deliver mixed results, software starts to break down when scanning gigabytes, let alone terabytes of data, and thousands of “CRITICAL” and “HIGH SEV” alerts get generated for issues that don’t even need to be addressed. Rather than check feature boxes and create noisy environments that bog down security and engineering teams, we’ve taken a different approach:
Our Thesis
We’re building Teleskope to automate data protection, from detection, to remediation, to prevention. And we’re breaking our approach down into three simple tenets:
- Accurate classifications and actionable insights are the building blocks for any data security program. Garbage in = garbage out.
- Automated remediation is the only way to enforce data protection at scale.
- Proactive prevention, and integrating with developer tooling, is the only way to ensure you’re not playing catch-up, and are instead maintaining data protection by default.
We’re on a mission to build not only the best-in-class data protection platform, but also a transparent one. Stay tuned for more in-depth, technical blogs about all things data security and governance.
Introduction
Kyte unlocks the freedom to go places by delivering cars for any trip longer than a rideshare. As part of its goal to reinvent the car rental experience, Kyte collects sensitive customer data, including driver’s licenses, delivery and return locations, and payment information. As Kyte continues to expand its customer base and implement new technologies to streamline operations, the challenge of ensuring data security becomes more intricate. Data is distributed across both internal cloud hosting and third-party systems, making compliance with privacy regulations and data security paramount. Kyte initially attempted to address data labeling and customer data deletion manually, but this quickly became an untenable solution that could not scale with their business. Building such solutions in-house didn’t make sense either, as they would require constant updates to accommodate growing data volumes, which would distract their engineers from their primary focus of transforming the rental car experience.
Continuous Data Discovery and Classification
To protect sensitive information, you first need to understand it, so one of Kyte’s primary objectives was to continuously discover and classify their data at scale. To meet this need, Teleskope deployed a single-tenant environment for Kyte and integrated their third-party SaaS providers and multiple AWS accounts. Teleskope discovered and crawled Kyte’s entire data footprint, encompassing hundreds of terabytes in their AWS accounts, across a variety of data stores. Teleskope instantly classified that footprint, identifying over 100 distinct data entity types across hundreds of thousands of columns and objects. Beyond classifying data entity types, Teleskope also surfaced the data subjects associated with the entities, enabling Kyte to categorize customer, employee, surfer, and business metadata separately. This automated approach ensures that Kyte maintains an up-to-date data map detailing the personal and sensitive data throughout their environment, enabling them to maintain a structured and secure environment.
Securing Data Storage and Infrastructure
Another critical aspect of Kyte’s Teleskope deployment was ensuring the secure storage of data and maintaining proper infrastructure configuration, especially as engineers spun up new instances or made modifications to the underlying infrastructure. While crawling Kyte’s cloud environment, Teleskope conducted continuous analysis of their infrastructure configurations to ensure their data was secure and aligned with various privacy regulations and security frameworks, including CCPA and SOC 2. Teleskope helped Kyte identify and fortify unencrypted data stores, correct overly permissive access, and clean up stale data stores that hadn’t been touched in a while. With Teleskope deployed, Kyte’s team is alerted in real time if one of these issues surfaces again.
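As a rough illustration of the three checks described above (unencrypted stores, overly permissive access, and stale stores), here is a minimal sketch. It is not Teleskope's implementation: the `DataStore` fields, the `find_issues` function, the issue labels, and the 180-day staleness threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DataStore:
    """Hypothetical snapshot of one data store's configuration."""
    name: str
    encrypted: bool
    publicly_accessible: bool
    last_accessed: datetime


def find_issues(store: DataStore, now: datetime,
                stale_after: timedelta = timedelta(days=180)) -> list[str]:
    """Return the posture issues found for a single data store."""
    issues = []
    if not store.encrypted:
        issues.append("unencrypted")
    if store.publicly_accessible:
        issues.append("overly_permissive_access")
    if now - store.last_accessed > stale_after:  # threshold is an assumption
        issues.append("stale")
    return issues


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
stores = [
    DataStore("orders-db", True, False, now - timedelta(days=2)),
    DataStore("legacy-exports", False, True, now - timedelta(days=400)),
]
for store in stores:
    print(store.name, find_issues(store, now))
```

Running checks like these on every crawl, rather than once during an audit, is what turns a point-in-time review into continuous monitoring.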
End-to-End Automation of Data Subject Rights Requests
Kyte was also focused on streamlining data subject rights (DSR) requests. Whereas their team previously handled these requests manually, using workflows and forms, Kyte now uses Teleskope to automate data deletion and access requests across various data sources, including internal data stores like RDS and their numerous third-party vendors, such as Stripe, Rockerbox, Braze, and more. When a new DSR request is received, Teleskope identifies and maps the user’s data across internal tables containing personal information, and triggers the necessary access or deletion query for each data store. Teleskope also ensures compliance by automatically enforcing the request with third-party vendors, either via API integration or, where a third party doesn’t expose an API endpoint, via email.
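The fan-out described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Teleskope's code: the `Target` type, the `plan_dsr` function, the `user_id` column, the table name, and the assignment of a channel to each vendor are all hypothetical. The sketch only builds a plan of steps; it does not execute queries, call vendor APIs, or send email.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical place where a user's data lives."""
    name: str     # internal table name or vendor name
    channel: str  # "sql", "api", or "email"


def plan_dsr(user_id: str, action: str, targets: list[Target]) -> list[tuple]:
    """Build one (channel, target, payload) step per target for a DSR request."""
    if action not in ("delete", "access"):
        raise ValueError(f"unsupported action: {action}")
    steps = []
    for target in targets:
        if target.channel == "sql":
            verb = "DELETE" if action == "delete" else "SELECT *"
            # The user id is passed as a query parameter, never interpolated.
            # Table names here come from a trusted, internal data map.
            query = f"{verb} FROM {target.name} WHERE user_id = %s"
            steps.append(("sql", target.name, (query, (user_id,))))
        elif target.channel == "api":
            steps.append(("api", target.name,
                          {"user_id": user_id, "action": action}))
        elif target.channel == "email":
            # Fallback for vendors without an API endpoint.
            body = f"Please {action} all data for user {user_id}."
            steps.append(("email", target.name, body))
        else:
            raise ValueError(f"unknown channel: {target.channel}")
    return steps


targets = [
    Target("users", "sql"),           # hypothetical internal table
    Target("Stripe", "api"),          # channel assignment is an assumption
    Target("ExampleVendor", "email"), # placeholder vendor without an API
]
for step in plan_dsr("user-123", "delete", targets):
    print(step)
```

Separating planning from execution keeps each request auditable: the plan can be logged and reviewed before any deletion runs.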
Conclusion
With Teleskope, Kyte has been able to effectively mitigate risks and ensure compliance with evolving regulations as their data footprint expands. Teleskope reduced operational overhead related to security and compliance by 80%, by automating the manual processes and replacing outdated and ad-hoc scripts. Teleskope allows Kyte’s engineering team to focus on unlocking the freedom to go places through a tech-enabled car rental experience, and helps to build systems and software with a privacy-first mindset. These tangible outcomes allow Kyte to streamline their operations, enhance data security, and focus on building a great, secure product for their customers.