APIs for Data Security and Privacy: Part One
Apigee was invited to submit testimony to a recent meeting of the API Task Force of the Office of the National Coordinator for Health Information Technology (known as the ONC). The Task Force posed a series of questions about the security of APIs and the ability of APIs to protect consumer data.
These questions are part of the panel’s effort to identify perceived and real privacy and security risks that could slow adoption of open APIs in healthcare. The Task Force will present a set of recommendations to the ONC to help consumers leverage APIs to access patient data securely. It is set to present these recommendations in April 2016.
This post, the first in a series, contains the initial part of my Task Force testimony.
In the last 10 years, we have seen the idea of a “web API” grow from an experiment created by a small number of web companies, to a technique used by popular social network mobile applications, to a mainstream set of technologies and best practices that are being used across the industry to make it easier for software developers to access data and services.
The very first web APIs may have been used for non-sensitive information such as weather forecasts and maps, but that quickly changed. Today we see APIs being used for everything from mobile payments to healthcare to wholesale financial services.
An API, at the simplest level, is a contract. The contract specifies how a software developer accesses an API, and tells the developer what to expect. A well-designed API makes this contract very clear through documentation and specifications that describe not only what kinds of requests the API expects, but what kind of security controls have been put in place and what set of security credentials a developer must acquire before she or he builds an application that uses the API.
Because an API is a contract, it is possible for the organization that offers the API to completely document and understand the interaction between the application that uses the API and the API itself. A variety of tools and techniques are available, both from commercial software vendors and from the open-source community, to ensure that API access is not allowed unless the client follows the contract. These tools may also be used to monitor API usage and gather data to understand exactly who is using the API and how.
This contract-driven interaction model makes it possible for the organization that provides an API to add policies and security controls on every interaction. An API team can therefore regulate which applications and end users are authorized to use an API and which parts of the API they are allowed to use.
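As a rough illustration of that kind of policy, the sketch below checks whether a given application may call a given part of an API. Everything in it is hypothetical: the API keys, application names, and paths are invented for the example, and a real deployment would typically keep this information in an API gateway or a database rather than in code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AppPolicy:
    """Hypothetical record of what one registered application may call."""
    app_id: str
    allowed_paths: frozenset = field(default_factory=frozenset)


# Hypothetical registry keyed by API key (illustrative values only).
POLICIES = {
    "key-abc": AppPolicy("scheduler-app", frozenset({"/appointments"})),
    "key-xyz": AppPolicy("records-app", frozenset({"/appointments", "/records"})),
}


def is_allowed(api_key: str, path: str) -> bool:
    """Return True only if the key is registered and the path is in its policy."""
    policy = POLICIES.get(api_key)
    if policy is None:
        return False  # unknown application: reject the call
    return path in policy.allowed_paths
```

Because the check runs on every call, an application that is allowed to read appointments is still turned away from the records endpoint.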
The team can also control what an authorized user can do, including limits on the number of API calls that can be made, or when they can be made. Finally, the team can follow the trail of API calls to understand exactly what authorized API users did, and what unauthorized attempts may have been made.
As a result, APIs, rather than being a new security risk, provide a well-documented, popular way for organizations to share access to data and services with third parties while maintaining strict security controls. Compared to other ways of sharing this data, such as a web site, file transfer, email, or even print, a well-implemented API offers a stronger set of security controls.
Are there any well-known threats or vulnerabilities associated with APIs themselves that should be addressed (e.g. security engineering considerations/best practices)?
Today, web APIs are a mature and popular technology choice. As such, there are a variety of security best practices that API providers should follow, and a great deal of information on these topics is available from various books and blogs.
Some of these security best practices are similar to those used on the World Wide Web today, such as proper use of TLS to ensure that communications are properly encrypted, and attention to common vulnerabilities such as “SQL injection” attacks.
Other best practices are specific to the world of APIs. For instance, it is common for the best-designed APIs to use a system of quotas and rate limits to protect the API against excessive traffic, even if that traffic comes from a properly authorized application.
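One common way to implement a rate limit is a token bucket, sketched below. This is only one possible technique, not a description of any particular product; the class name, parameters, and the injectable clock are assumptions made for the example.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the call."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

An API provider would typically keep one bucket per application or per API key, so that a single authorized but overly busy client cannot crowd out the others.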
As APIs are gaining adoption, are there steps organizations need to take to mitigate any additional threat vectors to data?
Yes. This is especially important when an API gives access to data that was not previously available through any other channel.
In these cases, a layered approach is best. For instance:
Use existing network security best practices such as firewalls, intrusion detection sensors, routers, network design, and proper management of TLS to protect assets
Use rate limiting mechanisms to control the amount of API data that may be consumed by all users, even authorized users
Verify the identity of the application that consumes the API, as well as the identity of the end user
Scan any incoming data for attack vectors such as SQL injection and other malformed input that may be designed to cause a target system to crash
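These practices combine long-standing network and web security measures with controls, such as rate limiting and application identity checks, that are specific to APIs.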
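To make the input-scanning layer concrete, here is a minimal sketch that validates a request parameter and then uses a parameterized query, so a value such as "1 OR 1=1" never reaches the database as SQL. The table name, column names, and identifier format are hypothetical and chosen only for illustration.

```python
import re
import sqlite3

# Hypothetical format for a patient identifier: 1 to 12 ASCII digits.
PATIENT_ID = re.compile(r"[0-9]{1,12}")


def find_patient(conn: sqlite3.Connection, patient_id: str):
    """Look up one row, rejecting malformed input before it reaches SQL."""
    if not PATIENT_ID.fullmatch(patient_id):
        raise ValueError("malformed patient id")
    # The ? placeholder keeps the value out of the SQL text itself.
    cur = conn.execute(
        "SELECT id, name FROM patients WHERE id = ?", (int(patient_id),)
    )
    return cur.fetchone()
```

Validation and parameterized queries are complementary: the first rejects obviously malformed requests early, and the second protects the database even if a bad value slips through.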
However, in many cases APIs are being offered not as a way to expose data that has never been exposed before, but as a new way to get access to data that was previously exposed using some other mechanism such as a web site, a file transfer service, or even email.
In these cases, we would argue that moving to an API approach from one of these technologies results in a system that is actually more secure than the alternatives.
For instance, web sites are subject to any number of “screen scraping” techniques. Any reasonably clever web developer can use these techniques to discover what data is available to the web app. Since a web app is a complex application with many entry points, it is often the case that a clever developer can figure out ways to get access to data that they might not be able to access merely through a web browser.
And even if a web site is 100 percent bug-free and this kind of access is not possible, when developers “screen scrape” the web site in this way, the team that controls the site does not have a good mechanism to understand whether it is happening, who is doing it, what they are doing, and how much data they have accessed.
A properly secured API, however, can include security techniques such as the ones described above that can ensure that only properly authorized applications, built by authorized developers, can access only the data that is exposed by the API.
Furthermore, the team providing the API can put traffic management features in place to control how much data each developer is allowed to access. And in the worst case, since the API was designed from scratch for easy access by developers, it is usually easier to use audit trails and logs to see exactly what was accessed, by whom, and for how long.
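The audit trail described above can be sketched as a simple in-memory log. The field names and the AuditLog class are assumptions for the example; a production system would write to durable, tamper-resistant storage instead.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One API call: when it happened, who made it, what was asked, and the outcome."""
    timestamp: datetime
    app_id: str
    user_id: str
    path: str
    allowed: bool


class AuditLog:
    def __init__(self):
        self.entries = []

    def record(self, app_id: str, user_id: str, path: str, allowed: bool) -> None:
        """Append one entry stamped with the current UTC time."""
        now = datetime.now(timezone.utc)
        self.entries.append(AuditEntry(now, app_id, user_id, path, allowed))

    def calls_by_app(self) -> Counter:
        """Count calls per application, including rejected ones."""
        return Counter(e.app_id for e in self.entries)

    def denied(self) -> list:
        """Return every call that was rejected."""
        return [e for e in self.entries if not e.allowed]
```

With entries like these, the team can answer the questions that screen scraping leaves open: which application made each call, on whose behalf, and how many calls were made or refused.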
In the next post, I’ll answer questions about security precautions particular to APIs in the healthcare industry and ways to ensure that APIs aren't used for malicious activities.