Cloud Exposure, DLP & IR, A-Z
By: Mary Ellen Kennel, Security Engineer
last updated: 9/15/2018
https://twitter.com/icanhaspii
https://ManhattanMennonite.blogspot.com
https://www.linkedin.com/in/MaryEllenKennel
Copyright © 2018 by Mary Ellen Kennel. All Rights Reserved
ABOUT THE PAPER
Why This Document; Why Now? I was incredibly inspired by Ed Skoudis’ portion of the 2018 SANS RSA Keynote entitled “The Five Most Dangerous New Attack Techniques”: https://www.sans.org/five. In his keynote, Ed talked about our increasing collaboration with cloud-based tools and repositories. Some examples were Amazon AWS/S3, Docker Hub, GitHub, Google Cloud and Microsoft Azure. Ed reminded us that we’ve seen some pretty serious “oopsies” from several high-profile entities over the past year (GoDaddy[17][18][19], Time Warner[1], Uber[2], U.S. Army[3], Verizon[4]), and that data exposure can result from something as simple as a private repository misconfigured as public, or a public repo that mistakenly contains sensitive data. The talk was so popular that there has since been a SANS follow-up webinar (also posted at the aforementioned link).
When Ed Skoudis speaks, we must listen. Ed was recently ranked the #1 security advisor and influencer to CISOs[5], and there’s an endless number of reasons why. All three speakers at the keynote were wonderful; however, this paper focuses on Ed’s portion of the panel.
Additionally, a former manager of mine recently asked me if I had any resources around Incident Response in the cloud that I could share. I would like to thank him now (without stating his name) for also being the impetus behind my assembling all of these notes and pushing them out. In his RSA talk, Ed gave us plenty of Cloud IR suggestions, but I’ve found a few nuggets that I’ll throw into the ring as well. Hopefully there’s enough info in this paper to get you started thinking about what your cloud footprint and exposure might be, and ideally you’ll be motivated to embark down your own path that will yield even more wisdom.
Lastly, but certainly NOT the least, I was blown away by my friend (and my AboutDFIR.com partner) Devon Ackerman’s talk regarding cloud forensics, when he was a guest on David Cowen’s “Forensic Lunch” – https://www.youtube.com/watch?v=WgRxPCofIrA. Devon’s talk inspired me to strive to learn all that I could about “the cloud.”
All of the above factors contributed to my wanting to assemble something to share with our amazingly wonderful DFIR community. There’s no doubt that I’ve missed many sources as the Cloud IR space seems to be on fire right now, so feel free to use this document as a guide to get you started, but with the understanding that there’s certainly way more information out there. I do NOT claim to be an expert authority on anything discussed here, but rather an interested party who spun up a free AWS account six months ago to poke around, and plans to do the same with both Azure and Google down the road.
Finally, please forgive me if you’re a vendor and I’ve missed you, misrepresented you, or even misquoted you. There are plenty of vendor haters in our industry; I don’t believe I’m one of them. If you do a recursive search on my Twitter posts and other articles, I believe I have tried to be fair…dare I even say kind, to vendors.
A few words of thanks, and then we’ll begin. To all the readers, whether you are still in college and hoping to get into the field, or a seasoned veteran, thanks to each and every one of you for taking the time to read what I’ve written, and for not being afraid to ask questions. Some of my more popular and well-received posts have been inspired by friends who have asked me questions, and it always brings me such joy when I can publish something that might help others. I have immeasurable gratitude for Devon Ackerman; he’s been a wonderful friend and confidant. David Cowen, thank you for everything you do for our community, and for always being open to my asking for advice. Harlan Carvey, thank you for suggesting after my last paper that if you’re going to list a tool, don’t just list its name; write a little something about it. Paul Lewis, I love working with you; you make every workday fun! Thank you for your help with the title. Lastly, thank you to my neighbors, Gireesh and Natalie, for allowing me to bug them with a few questions regarding gitignore files.
INTRODUCTION
Important Disclaimer:
Right before posting this, I waffled back and forth about leaving certain results of my research intact vs. hitting the “delete” key on my keyboard. I decided that I would leave the items in, but not without adding a disclaimer. A few of the resources I found over the course of my research were downright abominable. Sometimes as researchers, investigators, and truth seekers, we unfortunately come across online content that we wish we could wipe from our memory. Some of the things people paste are just awful…I’ll leave my disclaimer at that.
The approach I’ve taken is along the lines of what Devon Ackerman and I try to do with our AboutDFIR.com resource. An itemized and organized collection, this time of a whole host of cloud and code/data repository nuggets that are aimed at helping you find your data leaks (and trust me, you have them) before anyone else does, and then how to handle those incidents when you find them.
CLOUD EXPOSURE & DLP
I’m guessing that GitHub and PasteBin are the most well-known code/data repos, but there are plenty of others. I’ll list a few below to give you an idea of just how many there are. Some of the sites below have a “Search” button that allows you to trawl other users’ content, while others are potential risks because anything stored on them could be exposed should the site become compromised. Be careful to configure access controls properly, and if for some reason a developer simply must store code publicly, make sure they scrub it first. I’ve helped a client with an exposure concern where a developer thought they were pushing sensitive code to their internal enterprise instance of GitHub but were in fact publishing it externally, completely by mistake.
A Few Repo Examples:
- BitBucket: https://BitBucket.org
- CircleCI: https://CircleCI.com
- CodePad: http://CodePad.org
- CodeShip: https://codeship.com
- Dumpz: https://dumpz.org
- GhostBin: https://GhostBin.com
- Plug into a search engine: “GhostBin.com password” and you might occasionally find things.
- GitHub: https://GitHub.com
- GitHub has a pretty decent search component already baked in.
- GitHub.io/GitHubPages: https://pages.github.com
- GitHub Gist: https://gist.github.com
- GitLab: https://About.GitLab.com
- Grammarly: https://www.Grammarly.com
- I’m sure the enterprise version can be locked down, but if not blocked, their Chrome extension does allow access from a browser so one could paste files from one location and then possibly access them from another.
- ImGur: https://imgur.com
- JSBeautifier: http://JSbeautifier.org
- JustPasteIt: https://JustPaste.it
- LightShot: https://prnt.sc
- LucidChart: https://www.LucidChart.com
- You might be wondering why this one is here, but I wonder about all the network diagrams they could be housing.
- NoPaste.xyz: https://NoPaste.xyz
- PasteBin: https://PasteBin.com
- PasteBin has a pretty good built-in search function.
- On any given day, visit and search for “password” and you’re bound to find creds of some sort.
- Paste.ee: https://paste.ee
- PasteGuru: http://pasteguru.com
- Phacility: https://www.phacility.com/phabricator
- StickyNotes: https://Paste.is
- PastieBin: https://www.PastieBin.com
- SourceForge: https://sourceforge.net
- NoPasteTightDev.net: http://NoPaste.TightDev.net
- Trello: https://Trello.com
- See Brian Krebs’ Article: https://krebsonsecurity.com/2018/06/further-down-the-trello-rabbit-hole
Multiple Repo Listings:
- No Paste: https://php.earth/docs/interop/nopaste
- TehSausage: http://tehsausage.com/paste/never-dies
- IntelTechniques by Michael Bazzell: https://inteltechniques.com/osint/pastebins.html
- Allows you to search 57 paste sites at once.
Popular Cloud Based Services:
- Amazon Web Services (AWS):
- Debunking the myth: S3 Buckets are exposed by default:
- “By default, the public access to this bucket is disabled.”[6]
- “First off, there are default security settings on S3 instances, so the information had to be exposed deliberately…”[7]
- Free Trial: https://aws.amazon.com
- Docker Hub: https://Hub.Docker.com
- Google Cloud Computing:
- Free Trial: https://Cloud.Google.com
- IBM Cloud Computing:
- Free Trial: https://www.ibm.com/cloud/info/save-on-servers
- Microsoft Azure: https://Azure.Microsoft.com/en-us
- Oracle Cloud Computing:
- Free Trial: https://cloud.oracle.com/home
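To make the point about S3 defaults a bit more concrete: a bucket generally ends up public because someone attached a policy (or ACL) that says so. Below is a minimal Python sketch of how one might review a bucket policy document for wildcard principals. The sample policy, the bucket name example-bucket, and the function name public_statements are all made up for illustration, and the check deliberately ignores Condition blocks and ACLs, so treat it as a starting point rather than a complete audit.

```python
import json


def public_statements(policy_json):
    """Return the Allow statements whose principal is the wildcard '*'."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        # Principal can be the bare string "*" or a mapping such as {"AWS": "*"},
        # whose value may itself be a string or a list.
        if principal == "*":
            flagged.append(stmt)
        elif isinstance(principal, dict):
            values = principal.get("AWS", [])
            if isinstance(values, str):
                values = [values]
            if "*" in values:
                flagged.append(stmt)
    return flagged


# Hypothetical policy used only to exercise the function.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamOnly", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

for stmt in public_statements(SAMPLE_POLICY):
    print(stmt["Sid"], stmt["Action"])
```

Only the first statement in the sample is flagged, because it allows anyone to read objects; the second is scoped to a single account.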
Other Examples:
Note: This is a good time to point out that there is a whole other online universe available via Tor, Onion, etc. I don’t cover much of that here, but I’ve included the following for awareness.
- DoxBin: (Seems to be defunct but archive is still available.)
PostIts Onion:
- A sort of paste site for Tor.
Multiple Onion List: (A bit off-topic but tangentially related so I’ve left in, most links are OLD.)
Probably Worth Monitoring – Question and Answer Boards:
What about question and answer boards…are they locked down? Are people sharing more than they intend to when they paste in their problematic system errors or workplace rants? Are those maybe worth monitoring within (or for) your organization? The following is a list to get you thinking along that vein:
- IRC: http://www.irchelp.org
- GlassCeiling: http://glassceiling.com
- SpiceWorks: https://www.spiceworks.com
- StackExchange: https://Workplace.StackExchange.com
- StackOverflow: https://stackoverflow.com
Misc Amazing Find From an Anonymous Tip:
- “Bringing a Machete to the Amazon” by Erik Peterson
- https://www.softwaretalks.io/v/3634/bringing-a-machete-to-the-amazon
YOUR DIGITAL FOOTPRINT
How vast is your digital footprint? Do you even know?
There’s a silly saying, “What happens in Vegas, stays in Vegas.” Well, similarly, what happens online stays online; the difference is that it may be viewed around the globe. As an example, I live near a fairly well-known couple who get their picture taken a lot. When they asked me my line of work, they were intrigued because they were interested in how to better protect their online profile and minimize their digital footprint. A lot of high-profile individuals purchase their homes and property using the name of their production company instead of their real names, but this particular couple had not done that. Maybe they weren’t as well known when they purchased their property, or perhaps just weren’t made aware of the footprint it might leave. I’m not exactly sure, but I figured I wouldn’t ask.
When I got home after our conversation, I searched online for their names, and their home address was one of the first hits returned. As a favor, I went ahead and began requesting takedowns from each of the numerous sites hosting their personal info, and eventually, with a lot of perseverance, I had pretty much scrubbed both of their search results so that all that was returned was their awards, TV shows, movies, etc.
I say all of this not to come across as a “holier than thou” jerk…I had a 12-year career as a television executive before I ever got into DFIR, and trust me, I’ve met just about every celebrity you can name, and they are people just like us. It’s not about that; it’s about demonstrating how even simple everyday life can leave you exposed.
Getting both of their personal footprints minimized took me roughly 12 hours of work over the course of 2.5 weeks. There was a lot of back-and-forth with site owners, and one of the domains took three requests until it removed their home address.
Below are a few sites that might help you discover bytes of information that you thought had previously been erased.
Archived Pages:
- Archive.li: https://archive.li/ghostbin.com
- Archive.org: https://archive.org
- Cached Pages: http://www.cachedpages.com
- CachedPage.co: https://cachedpage.co
- Google Cache Formula:
“You can access the cached version for any page that has been saved by Google with this:
http://webcache.googleusercontent.com/search?q=cache:http://example.com/
Change http://example.com/ to any URL. You can also create a custom search engine on Chrome or a Firefox keyword to go to cached versions automatically by adding a keyword before the current URL address.”[8]
“A Comparative Taxonomy and Survey of Public Cloud Infrastructure Vendors” by Dimitrios Sikeridis, Ioannis Papapanagiotou, Bhaskar Prasad Rimal, and Michael Devetsikiotis:
- Opens a PDF File: https://arxiv.org/pdf/1710.01476.pdf
ASSET INVENTORY
Ed talked about how important asset inventory is. He went as far as to suggest that, depending on the size of your organization, perhaps you even hire a Data Curator and educate your developers and architects to interact with them as a key stakeholder. I keep reading articles stating, “You can’t secure what you don’t know you have,” and that concept is so simple, so 101, but it really seems to be the Achilles heel of our industry right now, in so many ways…in my humble opinion.
- Inventory of computer systems
- Inventory of your data
- Senrio: https://blog.senr.io/blog/introducing-senrio-discovery
- BitDiscovery: https://bitdiscovery.com
I recently came across this pretty cool start-up run by industry giant Jeremiah Grossman. He’s got an amazing company that can help you get your arms around your organization’s online presence. Do you work for a medium or large-sized org and think you know all of your domain names? I guarantee you’ll change your answer once you talk to Jeremiah! How do I know that? Because if you answered “yes”, he’ll prove you wrong, and if you answered “no”, once you run his tool, that “no” will instantaneously become a “yes”!
If BitDiscovery’s got your Web inventory covered (unless you’d rather spend an average of two months’ worth of lunch breaks in front of an Excel spreadsheet while performing recon – I wouldn’t know anything about that!), what about some of the aforementioned code repository sites that you don’t own and that might not fall within your purview? Maybe you place controls around some of them? Or, if you can’t (or don’t want to for business reasons) block them, maybe instead you set up some specialized monitoring around them?
I took SANS SEC555 in April with Justin Henderson, and his Twitter handle isn’t @SecurityMapper for nothing. Dude knows his stuff! And, he can help you find things on your network you never knew existed. No, his time isn’t free, but coincidentally I am friends with someone who also took Justin’s class and hired him afterward to help map out his company’s critical assets, and he was recently singing Justin’s praises to me. Full disclosure, Justin and I started out as SANS “Work Studies” together, so I am friendly with Justin, but even if I didn’t know him the little bit that I do, I would be telling you the same thing. His class will ROCK. YOUR. WORLD. Don’t take it if you’re not prepared to have a “SANS Moment”. What’s that, you ask? It’s the experience of sitting in a SANS class and watching the person next to you (or yourself) jump up, lock their screen, grab their phone, and run for the door because the instructor has just dropped a critical piece of knowledge that requires them to assemble an all-hands-on-deck moment back at their enterprise.
DISCOVERY
So how do you know if your code or data is sitting out there, exposed?
Free Resources:
“have I been pwned” database:
For credentials, you can always check Troy Hunt’s “have I been pwned” cache.
CheckMyDump by Ryan Moon:
“I am the Check My Dump robot, I post interesting things I find to twitter.”
DumpMon by Jordan Wright:
“A bot which monitors multiple paste sites for password dumps and other sensitive information.”
Stream Sniffer PasteBin Troll:
Doesn’t appear to be maintained, but I’ve heard this type of search is tough to get right, so perhaps someone could locate the owner and help get it back up and running?
Poke around on your own:
- Amazon AWS
The bucket name is everything before “.s3.” in the hostname.
https://www.youtube.com/watch?v=_x5VKuFjvrk
If a directory listing is open, pick a <Key> entry and append it, after a “/”, to the bucket root, such as:
root=http://<name of AWS bucket>.s3.amazonaws.com
Key=<Full path of key>
Full URL= http://<name of AWS bucket>.s3.amazonaws.com/<Full path of key>
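As a rough illustration of the formula above, here is a small Python sketch that takes the XML an open bucket returns, pulls out each <Key>, and builds the full object URL. The sample listing, the bucket name example-bucket, and the function name object_urls are made up for the example; the XML namespace is the one S3 uses in its listing responses. It works on a string you have already saved, so it makes no network requests.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace S3 puts on its bucket listing documents.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

# Hypothetical listing, trimmed to the elements this sketch reads.
SAMPLE_LISTING = """
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Contents><Key>backups/db dump.sql</Key><Size>1024</Size></Contents>
  <Contents><Key>config/settings.yml</Key><Size>200</Size></Contents>
</ListBucketResult>
""".strip()


def object_urls(listing_xml):
    """Build root + "/" + key for every <Key> in a bucket listing."""
    root = ET.fromstring(listing_xml)
    bucket = root.findtext(S3_NS + "Name")
    base = "http://{}.s3.amazonaws.com".format(bucket)
    # quote() percent-encodes spaces and similar characters but keeps "/".
    return [base + "/" + quote(key.text) for key in root.iter(S3_NS + "Key")]


for url in object_urls(SAMPLE_LISTING):
    print(url)
```

Whether any of those URLs can actually be fetched depends on the permissions of each object, which can differ from the permissions on the listing itself.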
“Git For Hackers” talk by rtzq0: https://www.layerone.org/speakers/#rtzq0
“While a lot of noise has been made about sensitive artifacts being left in git, and several tools exist to analyze git repositories for deleted files, a deep understanding of where and how git stores things (and therefore how such artifacts are being retained) eludes many. NO MORE! Rtzq0 will teach you how git functions, and in doing so you will come to understand the many ways in which things that people think have been deleted have not actually been deleted (and therefore how you might obtain them). Vivent les accidentally committed private keys!”
- https://www.layerone.org/speakers/#rtzq0
- https://gitlab.com/dc562/talks/tree/master/Rtzq0/git_for_hackers
- Note: Once downloaded, hit the forward key to advance through deck.
- https://www.layerone.org/archives (video is not posted yet but hopefully soon)
PasteHunter by Kevin Breen:
“Fair warning. The rules can be prone to false positives and don’t trust the value of the data any more than you trust the person who is uploading it to PasteBin in the first place.”
Hunting PasteBin for Fun and for Profit by Kevin Breen – SANS CyberThreat Summit 2018:
- Downloads a Zip File: https://www.sans.org/summit-archives/file/summit_archive_1526915577.zip
- https://techanarchy.net/2017/09/hunting-pastebin-with-pastehunter
PastaBean by Rory (@Tu5k4rr):
“Python Script to Scrape PasteBin with Regex. This is by far NOT a ‘finished project’ and plan to improve this over time. My goal is to make PastaBean as flexible as I can and simple to run with minimal requirements to capture data.”
- https://github.com/Tu5k4rr/PastaBean
- Author wrote me to add: “Thanks for the mention. I do plan to improve the alerting function and install script soon for my project.”
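For readers who want a feel for what “scrape with regex” means in practice, here is a minimal Python sketch of the matching step only. It is not PastaBean’s code. The patterns, the keyword example-corp, and the sample paste are all assumptions I made up, and the sketch works on text you already have in hand rather than fetching anything from a paste site.

```python
import re

# Hypothetical watch list: swap in your own domains, host names, project names.
WATCH_PATTERNS = {
    "email:password pair": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+:\S+"),
    "company keyword": re.compile(r"example-corp", re.IGNORECASE),
}


def scan_paste(text):
    """Return (line number, label) for every watched pattern found in a paste."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in WATCH_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


SAMPLE_PASTE = """just some notes
alice@example.com:hunter2
internal wiki for Example-Corp staff
"""

for lineno, label in scan_paste(SAMPLE_PASTE):
    print("line {}: {}".format(lineno, label))
```

As Kevin Breen’s warning above suggests, expect false positives; loose patterns like these are better for triage than for alerting.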
PasteBin Parser by Andrew Nohawk: (Doesn’t appear to be maintained)
“It lets you enter a query, which it runs against several pastebin sites using a variety of techniques. Andrew also makes the tool available for download, if you want to install it locally and customize it for your needs.”[9]
PasteBin Scraper by Andrew Nohawk: (Doesn’t appear to be maintained)
“Think of it as a means of searching various pastebins for information.”
PasteLert by Andrew Nohawk: (Doesn’t appear to be maintained)
“…allows you to set up alerts, so you get notified when the monitored PasteBin sites publish content that matches the desired keywords.”[9]
PasteNum by the CoreLAN Team and @shadowbq: (Doesn’t appear to be maintained)
“Pastenum is a text dump enumeration tool. It is designed to help find and enumerate datadumps, and doxs posted on public sites. It currently searches sites github.com, gist.github.com, pastebin.com, pastee.org, and pastie.org. Pastenum is a gem rewrite of nullthreat’s original pastenum2 released in 2011.”
- https://github.com/shadowbq/pastenum
- http://www.nullthreat.net/2011/06/updated-pastenum.html
- https://www.corelan.be/index.php/2011/03/22/pastenum-pastebinpastie-enumeration-tool
- http://web.archive.org/web/20110817234811/http://redmine.corelan.be:8800/projects/corelan-pastenum/files
- https://github.com/corelanc0d3r
- https://www.youtube.com/watch?v=Lim3YTzL1f8
Paid Services:
There are companies you can pay to run keyword searches across paste and repo sites. I’ve listed a few below:
Anomali:
- Anomali can run keywords across their intel which includes compromised email addresses that were pulled from dump sites and known compromised paste sites.
- https://www.anomali.com
Digital Shadows: https://www.digitalshadows.com/digital-risk-solutions/data-exposure
FlashPoint:
- It’s my understanding that FlashPoint can perform keyword alerting on dark web forums.
- https://www.flashpoint-intel.com
FusionX:
LookingGlass:
Recorded Future:
- RecordedFuture can alert on social media sites, public repos, and general Web sites.
- https://www.recordedfuture.com
Terbium Labs:
Various PenTest companies will also perform that type of discovery service for you.
Note: I recall that a former co-worker wrote custom scripts to crawl IRC and paste sites. Some paste sites may offer premium subscriptions that allow you to crawl them.
Some Tools from Ed’s Talk:
Git-Seekret by Albert Puigsech Galicia:
- “Git module to prevent from committing sensitive information into the repository.”
- https://github.com/apuigsech/git-seekret
Git-Secrets by Michael T. Dowling:
- “Prevents you from committing passwords and other sensitive information to a git repository.”
- https://github.com/awslabs/git-secrets
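The idea behind both of these tools is to check content against known secret patterns before it is ever committed. The Python sketch below illustrates that idea only; it is not how git-secrets or git-seekret is implemented. The two patterns are assumptions chosen for the example (an AWS access key ID beginning with AKIA, and a PEM private key header), and the key shown is the example value AWS uses in its own documentation.

```python
import re

# Illustrative patterns only; real tools ship larger, better-tuned rule sets.
SECRET_PATTERNS = [
    ("AWS access key ID", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("private key header", re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")),
]


def find_secrets(text):
    """Return (line number, label) for each line that looks like it holds a secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS:
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


# Stand-in for the content of a file about to be committed.
STAGED = """aws_access_key_id = AKIAIOSFODNN7EXAMPLE
region = us-east-1
-----BEGIN RSA PRIVATE KEY-----
"""

for lineno, label in find_secrets(STAGED):
    print("line {}: possible {}".format(lineno, label))
```

A check like this could be wired into a pre-commit hook that refuses the commit when find_secrets returns anything, which is roughly the workflow the tools above automate.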
GitRob by Michael Henriksen:
“Gitrob is a command line tool which can help organizations and security professionals find sensitive information lingering in publicly available files on GitHub. The tool will iterate over all public organization and member repositories and match filenames against a range of patterns for files that typically contain sensitive or dangerous information.”
- https://github.com/michenriksen/gitrob
- http://michenriksen.com/blog/gitrob-putting-the-open-source-in-osint
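The filename matching that the Gitrob description mentions can be sketched in a few lines of Python. This is my own illustration and not Gitrob’s rule set: the patterns in SUSPICIOUS_NAMES and the sample paths are assumptions, picked because files with those names often hold keys or credentials.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for file names that often contain sensitive material.
SUSPICIOUS_NAMES = ["*.pem", "id_rsa", "*.pfx", ".env", "*.sql", ".htpasswd"]


def flag_paths(paths):
    """Return the paths whose file name matches any suspicious pattern."""
    flagged = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # compare against the file name only
        if any(fnmatchcase(name, pattern) for pattern in SUSPICIOUS_NAMES):
            flagged.append(path)
    return flagged


REPO_FILES = [
    "README.md",
    "deploy/keys/prod.pem",
    "home/.ssh/id_rsa",
    "src/app.py",
    "backup/users.sql",
]

for path in flag_paths(REPO_FILES):
    print(path)
```

Matching on names alone says nothing about what a file contains, so each hit still needs a human look.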
Macie (Amazon S3):
“Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS. Amazon Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property, and provides you with dashboards and alerts that give visibility into how this data is being accessed or moved.”
Some Other Interesting Stuff I Found:
BucketStream by Paul Price and David Prince:
“Find interesting Amazon S3 Buckets by watching certificate transparency logs. This tool simply listens to various certificate transparency logs (via certstream) and attempts to find public S3 buckets from permutations of the certificates domain name.”
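To show what “permutations of the certificate’s domain name” might look like, here is a small Python sketch. It is not BucketStream’s code; the affix list and the naming formats are assumptions I picked for illustration, and the function only generates candidate names without contacting anything.

```python
def candidate_buckets(domain, affixes=("backup", "dev", "prod", "assets")):
    """Generate guessable bucket names from a domain seen in a certificate."""
    # Certificates often carry wildcard names such as "*.example.com".
    labels = [part for part in domain.lower().split(".") if part and part != "*"]
    if len(labels) < 2:
        return []
    # Naive assumption: the label before the TLD is the organization name.
    # This is wrong for suffixes such as "co.uk".
    name = labels[-2]
    candidates = [name]
    for affix in affixes:
        candidates.append("{}-{}".format(name, affix))
        candidates.append("{}-{}".format(affix, name))
    return candidates


for bucket in candidate_buckets("shop.example.com"):
    print("http://{}.s3.amazonaws.com".format(bucket))
```

Running the same permutations against your own organization’s domains is a quick way to see which names an outsider would try first.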
BuckHacker:[10][11][12]
“A custom search engine for Amazon S3 Buckets that those companies left open to the public!”
- http://www.BuckHacker.com (Currently Offline)
- https://twitter.com/TheBuckHacker
- https://thebuckhacker.com
- https://medium.com/@thebuckhacker/the-pandora-bucket-unleashed-e7fac79cbe19
TheBuckHacker:
- https://TheBuckHacker.com/blog
- https://twitter.com/BuckHacker
- https://thebuckhacker.com/blog/2018/08/27/online-bashing-and-further-news
GrayHat Warfare Public Buckets:
NetChecker by Arnim Eijkhoudt:
“A Tool/script to sift through dumps”
Search Google (or your favorite search engine):
- Business units, host names, etc., which will yield results from repo sites that might not have great built-in search functionality natively.
Google keyword alerting:
- You can set up your own keyword alerts in Google.
SearchCode.com:
- Offers searches for code repositories and snippets across GitHub and similar sites.
- https://searchcode.com
“Allow S3 Tests Against Buckets in Other Regions” by Andrew White:
“Only us-east-1 gives URLs like bucket.s3.amazonaws.com whereas other regions have URLs like s3-eu-west-1.amazonaws.com/ubxd-rails”
Andrew’s quote raises a question that is answered by the following documentation from Amazon regarding regional S3 endpoints. In other words, if you’re in the U.S., don’t forget there’s a whole world outside your window! We are a global economy and there are TONS of S3 buckets across the globe: https://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
Below is a table with Amazon S3 regions and corresponding website endpoints:
Region | Website Endpoint |
US East (N. Virginia) | s3-website-us-east-1.amazonaws.com |
US East (Ohio) | s3-website-us-east-2.amazonaws.com |
US West (N. California) | s3-website-us-west-1.amazonaws.com |
US West (Oregon) | s3-website-us-west-2.amazonaws.com |
Canada (Central) | s3-website.ca-central-1.amazonaws.com |
EU (Ireland) | s3-website-eu-west-1.amazonaws.com |
EU (London) | s3-website-eu-west-2.amazonaws.com |
EU (Paris) | s3-website-eu-west-3.amazonaws.com |
EU (Frankfurt) | s3-website.eu-central-1.amazonaws.com |
Asia Pacific (Mumbai) | s3-website.ap-south-1.amazonaws.com |
Asia Pacific (Singapore) | s3-website-ap-southeast-1.amazonaws.com |
Asia Pacific (Sydney) | s3-website-ap-southeast-2.amazonaws.com |
Asia Pacific (Tokyo) | s3-website-ap-northeast-1.amazonaws.com |
Asia Pacific (Seoul) | s3-website.ap-northeast-2.amazonaws.com |
South America (Sao Paulo) | s3-website-sa-east-1.amazonaws.com |
AWS GovCloud (US) | s3-website-us-gov-west-1.amazonaws.com |
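Because the bucket name can sit either in the hostname (bucket.s3.amazonaws.com) or in the path (s3-eu-west-1.amazonaws.com/ubxd-rails), as Andrew’s quote points out, any script that searches logs or paste sites for your buckets needs to handle both shapes. Here is a minimal Python sketch of one way to do that. The function name bucket_from_url is made up, and the parsing rule (find the first host label that is “s3” or starts with “s3-”) is my own simplification, so it will misfire on unusual bucket names.

```python
from urllib.parse import urlparse

AWS_SUFFIX = ".amazonaws.com"


def bucket_from_url(url):
    """Return the bucket name from a hostname-style or path-style S3 URL, else None."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.netloc.lower()
    if not host.endswith(AWS_SUFFIX):
        return None
    path_parts = [part for part in parsed.path.split("/") if part]
    labels = host[: -len(AWS_SUFFIX)].split(".")
    for index, label in enumerate(labels):
        if label == "s3" or label.startswith("s3-"):
            if index > 0:
                # Hostname style: everything before the S3 label is the bucket.
                return ".".join(labels[:index])
            # Path style: the bucket is the first path segment.
            return path_parts[0] if path_parts else None
    return None


for example in (
    "bucket.s3.amazonaws.com",
    "s3-eu-west-1.amazonaws.com/ubxd-rails",
    "http://bucket.s3-website-us-east-1.amazonaws.com/index.html",
):
    print(example, "->", bucket_from_url(example))
```

The third example shows that the same rule also copes with the website endpoints in the table above.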
CloudMapper by Duo Labs (The Security Research Team at Duo Security):
Azure SQL Threat Detection:
- https://azure.microsoft.com/en-us/roadmap/azure-sql-database-security-threat-detection
- https://docs.microsoft.com/en-us/azure/sql-database/sql-database-threat-detection
Azure Security by Tanya Janca: (Hoping a video gets posted)
Azure Active Directory Leaks by Mike Felch:
https://www.blackhillsinfosec.com/red-teaming-microsoft-part-1-active-directory-leaks-via-azure
Google Cloud Data Loss Prevention API:
Head in the Clouds by Christopher Maddalena:
“Head in the Clouds” – Security considerations around AWS, Azure, and GCP with some very actionable cheat sheets on doing some of the legwork yourself.
“My Arsenal of AWS Security Tools” by Toni de la Fuente: (And boy is it ever!)
“List of open source tools for AWS security: defensive, offensive, auditing, DFIR, etc. I’ve been using and collecting a list of helpful tools for AWS security. This list is about the ones that I have tried at least once and I think they are good to look at for your own benefit and most important: to make your AWS cloud environment more secure.”
“Reserved Words” of Programming Languages by Orest Ivasiv:
- Many programming languages have words that can be considered unique to their code, or words that can help identify their scripts. The following site breaks down many of those, which may help you identify leakage when driving keyword searches across various repos.
- http://halyph.com/blog/2016/11/28/prog-lang-reserved-words.html
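Here is a hedged Python sketch of how reserved words could help label a snippet you find in a paste. Python’s own list comes from the standard library’s keyword module; the JavaScript list is hand-typed by me and may be incomplete, and the scoring rule (count only words that belong to exactly one language) is just one simple approach. The function and variable names are made up for the example.

```python
import keyword
import re

RESERVED = {
    "Python": set(keyword.kwlist),
    # Hand-typed JavaScript reserved words; treat as an approximation.
    "JavaScript": {
        "break", "case", "catch", "class", "const", "continue", "debugger",
        "default", "delete", "do", "else", "export", "extends", "finally",
        "for", "function", "if", "import", "in", "instanceof", "new",
        "return", "super", "switch", "this", "throw", "try", "typeof",
        "var", "void", "while", "with", "yield",
    },
}


def distinctive(reserved):
    """Keep only the words that appear in exactly one language's list."""
    result = {}
    for lang, words in reserved.items():
        others = set().union(*(w for other, w in reserved.items() if other != lang))
        result[lang] = words - others
    return result


def guess_language(snippet):
    """Return the language with the most distinctive reserved words, or None."""
    tokens = re.findall(r"[A-Za-z_]+", snippet)
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in distinctive(RESERVED).items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


JS_SNIPPET = "var total = function (x) { return typeof x; };"
PY_SNIPPET = "def area(r):\n    return 3.14 * r * r if r else None"

print(guess_language(JS_SNIPPET))
print(guess_language(PY_SNIPPET))
```

Short snippets and shared words make this unreliable on its own, so it is best used to narrow down which keyword lists to search with rather than as a verdict.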
Some “Reserved Words” for JavaScript:
- Search Gitignore[13]
Note: Gitignore files cover an area I know very little about, so consider what follows to be more hypothetical questions.
- I believe Gitignore can be used for a number of search use-cases. One that I can think of is that it’s a good repository of possible “Reserved Words” for many programming languages.
- Remember the old robots.txt[14] file and how Web site owners used to load-up that file with where on the site all their crown jewels were because they didn’t want those crawled? Could the gitignore file reveal similar bread crumbs?
- If gitignore files contain files we don’t necessarily want to commit, could there ever be an instance where credentials are found in there?
- Lastly, could a gitignore file include “hidden” code in the wild that might not turn up during a routine repo search?
- https://github.com/github/gitignore
Counting Lines of Code:
- Knowing the size of whatever you are trying to isolate in the wild might help you confirm a match. Maybe not, but I’ll include some info here around counting lines of code on the off chance it’s helpful:
- OhCount is a library for counting lines of source code: https://github.com/blackducksoftware/ohcount
- OhCount generates reports at: https://www.openhub.net
- Used to be OhLoh: https://www.openhub.net/p/ohloh
Source Code Search Engine:
- https://publicwww.com
“Search for any #HTML, #JavaScript, #CSS and plaintext in web page source code and download a list of websites that contain it.” The authors describe it as a search engine for the following:
- Any HTML, JavaScript, CSS and plain text in web page source code.
- References to StackOverflow questions in HTML, .CSS and .JS files.
- Web designers and developers who hate IE.
- Sites with the same analytics id: “UA-19778070-“.
- Sites using the following version of nginx: “Server: nginx/1.4.7”.
- Advertising networks users: “adserver.adtech.de”.
- Sites using same adsense account: “pub-9533414948433288”.
- WordPress with theme: “/wp-content/themes/twentysixteen/”.
- Find related websites through the unique HTML codes they share, i.e. widgets and publisher IDs.
- Identify sites using certain images or badges.
- Find out who else is using your theme.
- Identify sites that mention you.
- References to use a library or a platform.
- Find code examples on the internet.
- Figure out who is using what JS widgets on their sites.
GitHub Linguist:
“Linguist takes the list of languages it knows from languages.yml and uses a number of methods to try and determine the language used by each file, and the overall repository breakdown.” It’s a library to detect blob languages, help you ignore vendor files, and generate language breakdown graphs.
Evident:
“The new 2018 Cloud Security Report reveals that security concerns are on the rise, heightened by a shortage of qualified security personnel and the inability of most legacy security tools to address modern IT environments.”
Snyk:
“Stay secure! Get alerted when new vulnerabilities are found in your dependencies, and automatic pull requests when a fix is available.”
Cloud Inquisitor:
“Enforce ownership and data security within AWS”
Security Tools for AWS – A Wonderful Resource by Mark Hillick:
“I often get asked which tools are good to use for securing your AWS infrastructure so I figured I’d write a short list of some useful Security Tools for the AWS Cloud Infrastructure. This list is not intended be something completely exhaustive, more so provide a good launching pad for someone as they dig into AWS and want to make it secure from the start.”
CLOUD INCIDENT RESPONSE
Office365 IR:
- My AboutDFIR.com partner Devon Ackerman: (2018 Forensic Investigator of the Year!)[15]
- Devon as a guest on David Cowen’s Forensic Lunch: https://www.youtube.com/watch?v=WgRxPCofIrA
- Devon’s SANS DFIR Summit 2018 Presentation:
- Devon’s video will probably be posted on the SANS Summit YouTube site eventually:
Microsoft’s dedicated Office365 Blog:
https://techcommunity.microsoft.com/t5/Office-365-Blog/bg-p/Office365Blog
Adam Harrison’s Well-Received WriteUp:
Magnet Forensics’ AXIOM tool has integrated support for cloud forensics:
“ACQUIRING & PROCESSING CLOUD EVIDENCE – Learn how to acquire and process cloud evidence such as Gmail, O365, Facebook, Twitter, OneDrive, Dropbox, Box.com.”
Office365 Log Analysis Framework (OLAF) by Matt Bromiley:
“OLAF is a collection of tools, scripts, and analysis techniques dealing with O365 Investigations.”
O365 Lockdown (Hardening) by LMG Security:
“This is a simple Powershell script that can be used to better secure an Office 365 environment and enable audit logging. It is offered under the BSD license.”
Other Cloud IR:
Detecting Credential Compromise in AWS by Will Bengtson:
Amazing Scott Piper: (@0xdabbad00) – he runs the following:
- Cloud CTF: http://flaws.cloud
- Summit Route: https://summitroute.com/blog/2017/05/30/free_tools_for_auditing_the_security_of_an_aws_account
- You will learn best practices for the following…
AWS Trusted Advisor:
“The AWS Trusted Advisor service was released in July, 2014 and comes free with your AWS account and provides not only security checks, but also cost optimization, performance, and fault tolerance checks.”
CloudSploit:
“AWS security scanning checks – CloudSploit scans is an open-source project designed to allow detection of security risks in an AWS account. These scripts are designed to run against an AWS account and return a series of potential misconfigurations and security risks.”
Edda:
“Edda is a Service to track changes in your cloud deployments.”
Scout2:
“In 2012, Loïc Simon at iSEC Partners (now part of NCC Group) released a tool called Scout for auditing AWS environments. In 2014, they released a new version named Scout2. The open-source Scout2 project is focused toward pentesters doing one-time audits.”
Prowler:
“Toni de la Fuente (@ToniBlyx) at Alfresco released prowler in September, 2016 which was made to check the items from the CIS Amazon Web Services Foundations Benchmark. This tool is solely focused on the issues from that report, and is the only tool that generates a single report (albeit console based) to read through as opposed to an application you need to know how to click through and use.”
ReddAlert:
“Prezi’s reddalert was released in 2014 and uses Netflix’s Edda (discussed earlier in the AWS Config section), as opposed to the AWS API directly like the other tools. reddalert no longer appears to be maintained.”
Security Monkey:
“Netflix’s Security Monkey was released back in 2014 and detects issues both for AWS and Google Cloud Platform (GCP). Security Monkey is expected to be deployed as an entire EC2 and needs a PostgreSQL backend. It can repeatedly scan multiple accounts and generate alerts. Once alerted, security teams can browse a list of issues in its UI to review and justify or remediate.”
Cloud Custodian:
“CapitalOne’s Cloud Custodian was introduced in May, 2016. It doesn’t just detect issues like the other tools, but actually enforces compliance with an organization’s rules. It does this via the heavy handed method of in some cases, simply killing anything that isn’t in compliance.”
Jonathon Poling – IR in the Cloud SANS DFIR Summit 2018:
- https://www.sans.org/summit-archives/file/summit_archive_1528749610.pdf
- Jonathon’s video will probably be posted on the SANS Summit YouTube site eventually:
- Jonathon’s 2017 SANS DFIR Summit talk can be viewed at the link below:
You will learn best practices for the following…
AWS CloudFront:
“Amazon CloudFront is a global content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to your viewers with low latency and high transfer speeds. CloudFront is integrated with AWS – including physical locations that are directly connected to the AWS global infrastructure, as well as software that works seamlessly with services including AWS Shield for DDoS mitigation, Amazon S3, Elastic Load Balancing or Amazon EC2 as origins for your applications, and Lambda@Edge to run custom code close to your viewers.”
AWS CloudTrail:
“AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. CloudTrail provides event history of your AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. This event history simplifies security analysis, resource change tracking, and troubleshooting.”
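To make the "event history simplifies security analysis" point a bit more concrete, here is a minimal sketch that reads CloudTrail-style records and counts failed API calls per source IP. The sample records are made up for illustration, and the helper name failed_calls_by_ip is hypothetical; the field names (eventTime, eventName, sourceIPAddress, errorCode) and the top-level "Records" list follow the CloudTrail log file layout as I understand it.

```python
import json
from collections import Counter

# Made-up sample in the shape of a CloudTrail log file: {"Records": [...]}.
SAMPLE_LOG = json.dumps({
    "Records": [
        {"eventTime": "2018-09-01T12:00:00Z", "eventName": "ListBuckets",
         "sourceIPAddress": "198.51.100.7", "errorCode": "AccessDenied"},
        {"eventTime": "2018-09-01T12:00:05Z", "eventName": "GetObject",
         "sourceIPAddress": "198.51.100.7", "errorCode": "AccessDenied"},
        {"eventTime": "2018-09-01T12:01:00Z", "eventName": "DescribeInstances",
         "sourceIPAddress": "203.0.113.10"},
    ]
})


def failed_calls_by_ip(log_text):
    """Count records that carry an errorCode, keyed by source IP."""
    records = json.loads(log_text).get("Records", [])
    counts = Counter()
    for record in records:
        # Successful calls have no errorCode field in this sample.
        if "errorCode" in record:
            counts[record.get("sourceIPAddress", "unknown")] += 1
    return counts


if __name__ == "__main__":
    for ip, count in failed_calls_by_ip(SAMPLE_LOG).most_common():
        print(ip, count)
```

A burst of denied calls from a single address is only a starting point for an investigation, not proof of compromise.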
AWS CloudWatch:
“Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically react to changes in your AWS resources. Amazon CloudWatch can monitor AWS resources such as Amazon EC2 instances, Amazon DynamoDB tables, and Amazon RDS DB instances, as well as custom metrics generated by your applications and services, and any log files your applications generate. You can use Amazon CloudWatch to gain system-wide visibility into resource utilization, application performance, and operational health. You can use these insights to react and keep your application running smoothly.”
AWS Config:
“AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. With Config, you can review changes in configurations and relationships between AWS resources, dive into detailed resource configuration histories, and determine your overall compliance against the configurations specified in your internal guidelines. This enables you to simplify compliance auditing, security analysis, change management, and operational troubleshooting.”
AWS ELB (Elastic Load Balancing):
“Elastic Load Balancing automatically distributes incoming application traffic across multiple targets, such as Amazon EC2 instances, containers, and IP addresses. It can handle the varying load of your application traffic in a single Availability Zone or across multiple Availability Zones. Elastic Load Balancing offers three types of load balancers that all feature the high availability, automatic scaling, and robust security necessary to make your applications fault tolerant.”
AWS GuardDuty:
“Amazon GuardDuty is a managed threat detection service that continuously monitors for malicious or unauthorized behavior to help you protect your AWS accounts and workloads. It monitors for activity such as unusual API calls or potentially unauthorized deployments that indicate a possible account compromise. GuardDuty also detects potentially compromised instances or reconnaissance by attackers.”
AWS S3:
“Companies today need the ability to simply and securely collect, store, and analyze their data at a massive scale. Amazon S3 is object storage built to store and retrieve any amount of data from anywhere – web sites and mobile apps, corporate applications, and data from IoT sensors or devices. It is designed to deliver 99.999999999% durability, and stores data for millions of applications used by market leaders in every industry. S3 provides comprehensive security and compliance capabilities that meet even the most stringent regulatory requirements. It gives customers flexibility in the way they manage data for cost optimization, access control, and compliance. S3 provides query-in-place functionality, allowing you to run powerful analytics directly on your data at rest in S3. And Amazon S3 is the most supported cloud storage service available, with integration from the largest community of third-party solutions, systems integrator partners, and other AWS services.”
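Since so many of the "oopsies" mentioned earlier came down to a bucket that was readable by the world, here is a rough sketch of checking a bucket ACL for public grants. The sample ACL is made up and is shaped like the dictionary returned by boto3's get_bucket_acl call; the function name public_grants is hypothetical. This only looks at ACL grants, so a bucket made public through a bucket policy would not be caught by it.

```python
# Group URIs that S3 uses for "everyone" and "any authenticated AWS user".
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl):
    """Return (group URI, permission) pairs that open the bucket broadly."""
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUPS:
            found.append((grantee["URI"], grant.get("Permission")))
    return found


# Made-up ACL for illustration only.
SAMPLE_ACL = {
    "Owner": {"ID": "example-owner-id"},
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ],
}

if __name__ == "__main__":
    for uri, permission in public_grants(SAMPLE_ACL):
        print(permission, "granted to", uri)
```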
AWS VPC Flow:
“VPC Flow Logs is a feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data is stored using Amazon CloudWatch Logs. After you’ve created a flow log, you can view and retrieve its data in Amazon CloudWatch Logs.”
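Once you have retrieved flow log data, a quick first pass is often just "show me what was rejected." Below is a minimal sketch that parses records in the default space-separated flow log format and keeps the REJECT entries. The two sample lines use placeholder account and interface IDs, and the helper names parse_flow_line and rejected are hypothetical.

```python
from collections import namedtuple

# Field order of the default (version 2) flow log record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status")
FlowRecord = namedtuple("FlowRecord", FIELDS)


def parse_flow_line(line):
    """Split one flow log line into a FlowRecord (all values stay strings)."""
    parts = line.split()
    if len(parts) != len(FlowRecord._fields):
        raise ValueError("unexpected field count: %d" % len(parts))
    return FlowRecord(*parts)


def rejected(lines):
    """Return only the records whose action is REJECT."""
    return [rec for rec in map(parse_flow_line, lines) if rec.action == "REJECT"]


# Sample lines with placeholder IDs, for illustration only.
SAMPLE_LINES = [
    "2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 "
    "1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-abc123de 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 "
    "1418530010 1418530070 REJECT OK",
]

if __name__ == "__main__":
    for rec in rejected(SAMPLE_LINES):
        print(rec.srcaddr, "->", rec.dstaddr, "port", rec.dstport)
```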
Hunting on AWS by Alex Maestretti and Forest Monsen, SANS Threat Hunting Summit 2017 – contains some live response material: https://www.youtube.com/watch?v=LRxxN3KGLYo
- You will learn best practices for the following…
AWS Athena:
“Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.”
AWS ALB (Application Load Balancer):
“Application Load Balancer operates at the request level (layer 7), routing traffic to targets – EC2 instances, containers and IP addresses based on the content of the request. Ideal for advanced load balancing of HTTP and HTTPS traffic, Application Load Balancer provides advanced request routing targeted at delivery of modern application architectures, including microservices and container-based applications. Application Load Balancer simplifies and improves the security of your application, by ensuring that the latest SSL/TLS ciphers and protocols are used at all times.”
AWS CloudTrail: (already defined above)
AWS ELB: (already defined above)
AWS Lambda:
“AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create your own back-end services that operate at AWS scale, performance, and security. AWS Lambda can automatically run code in response to multiple events, such as HTTP requests via Amazon API Gateway, modifications to objects in Amazon S3 buckets, table updates in Amazon DynamoDB, and state transitions in AWS Step Functions.”
AWS WAF (Web Application Firewall):
“AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources. AWS WAF gives you control over which traffic to allow or block to your web applications by defining customizable web security rules. You can use AWS WAF to create custom rules that block common attack patterns, such as SQL injection or cross-site scripting, and rules that are designed for your specific application. New rules can be deployed within minutes, letting you respond quickly to changing traffic patterns. Also, AWS WAF includes a full-featured API that you can use to automate the creation, deployment, and maintenance of web security rules.”
Lacework:
“Automated Security for AWS. S3 Protection, Compliance, and more.”
Containers at Risk: A review of 21,000 Cloud Environments: https://info.lacework.com/containers-at-risk-cloud-environments-review
ThreatResponse:
“A Free Open Source Security Suite for Hardening and Responding in AWS”
Pacu by Rhino Security (Thanks to @DAkacki for sharing with me):
“Pacu is an open source AWS exploitation framework, designed for offensive security testing against cloud environments. Created and maintained by Rhino Security Labs, Pacu allows penetration testers to exploit configuration flaws within an AWS account, using modules to easily expand its functionality. Current modules enable a range of attacks, including user privilege escalation, backdooring of IAM users, attacking vulnerable Lambda functions, and much more.”
- https://github.com/RhinoSecurityLabs/pacu
- https://player.fm/series/brakeing-down-security-podcast-2391615/ep-2018-024-pacu-a-tool-for-pentesting-aws-environments
CloudGoat by Rhino Security:
“CloudGoat deploys intentionally vulnerable AWS resources into your account. DO NOT deploy CloudGoat in a production environment or alongside any sensitive AWS resources.”
CloudFire by Rhino Security:
“CloudFire focuses on discovering potential IP’s leaking from behind cloud-proxied services, e.g. Cloudflare. Although there are many ways to tackle this task, we are focusing right now on CrimeFlare database lookups, search engine scraping and other enumeration techniques.”
BucketHead by Rhino Security:
“buckethead.py searches across every AWS region for a variety of bucket names based on a domain name, subdomains, affixes given and more. Currently the tool will only present to you whether or not the bucket exists or if they’re listable. If the bucket is listable, then further interrogation of the resource can be done. It does not attempt download or upload permissions currently but could be added as a module in the future. You will need the awscli to run this tool as this is a python wrapper around this tool.”
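To get a feel for the "bucket names based on a domain name, subdomains, affixes" idea, here is a small sketch of how such a candidate list could be built. This is my own illustration and not BucketHead's actual code; the function name candidate_bucket_names and the particular naming patterns are assumptions. It only generates names and does not check whether any bucket exists.

```python
def candidate_bucket_names(domain, subdomains=(), affixes=()):
    """Build a de-duplicated, ordered list of bucket names worth checking."""
    base = domain.split(".")[0]  # "example" from "example.com"
    stems = [domain, base]
    stems += [f"{sub}.{domain}" for sub in subdomains]
    stems += [f"{sub}-{base}" for sub in subdomains]

    names = list(stems)
    for stem in stems:
        for affix in affixes:
            names.append(f"{stem}-{affix}")
            names.append(f"{affix}-{stem}")

    seen = set()
    ordered = []
    for name in names:
        name = name.lower()
        # S3 bucket names are limited to 3-63 characters.
        if 3 <= len(name) <= 63 and name not in seen:
            seen.add(name)
            ordered.append(name)
    return ordered


if __name__ == "__main__":
    for name in candidate_bucket_names("example.com", ["dev"], ["backup"]):
        print(name)
```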
AWS Privilege Escalation Scanner by Rhino Security:
“Using the script, it is possible to detect what users have access to what privilege escalation methods in an AWS environment. It can be run against any single user or every user in the account if the access keys being used have IAM read access. Results output is in csv, including a breakdown of users scanned and the privilege escalation methods they are vulnerable to.”
- https://github.com/RhinoSecurityLabs/Security-Research/blob/master/tools/aws-pentest-tools/aws_escalate.py
- https://rhinosecuritylabs.com/aws/aws-privilege-escalation-methods-mitigation (Spencer Gietzen)
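The core idea behind a scanner like this can be sketched in a few lines: map each escalation method to the permissions it needs, then see which users hold all of them. The sketch below is my own simplification and not the aws_escalate.py code. It covers only a handful of the methods from the Rhino write-up, the method labels are my own, the sample users are made up, and it matches action names exactly, so it ignores wildcards, Deny statements, resources and conditions.

```python
import csv
import io

# A small, illustrative subset: method label -> permissions it requires.
ESCALATION_METHODS = {
    "create_new_policy_version": {"iam:CreatePolicyVersion"},
    "create_access_key": {"iam:CreateAccessKey"},
    "attach_user_policy": {"iam:AttachUserPolicy"},
    "pass_role_to_new_ec2_instance": {"iam:PassRole", "ec2:RunInstances"},
}


def vulnerable_methods(allowed_actions):
    """Return the labels of methods whose required actions are all allowed."""
    allowed = set(allowed_actions)
    return sorted(name for name, needed in ESCALATION_METHODS.items()
                  if needed <= allowed)


def report_csv(users):
    """Render a CSV report: one row per user, methods joined with ';'."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["user", "escalation_methods"])
    for user in sorted(users):
        writer.writerow([user, ";".join(vulnerable_methods(users[user]))])
    return buffer.getvalue()


# Made-up users and their effective allowed actions.
SAMPLE_USERS = {
    "alice": ["iam:PassRole", "ec2:RunInstances", "s3:GetObject"],
    "bob": ["s3:ListBucket"],
}

if __name__ == "__main__":
    print(report_csv(SAMPLE_USERS), end="")
```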
AWS Best Practices by Andreas Chatzakis:
- Opens a PDF File: https://d1.awsstatic.com/whitepapers/AWS_Cloud_Best_Practices.pdf
Ricky Aldridge’s Helpful LinkedIn Post:
Microsoft has a dedicated Azure Security Blog:
AWS re:Invent 2017 Incident Response in the Cloud by Jim Jennis and Conrad Fernandes:
AWS Summit 2017 Incident Response in the Cloud by Jim Jennis and Conrad Fernandes:
BIG DATA
In his RSA keynote, Ed talked about “The Netflix Prize” and how, if all you have are a few nuggets of information, it’s possible to triangulate and pivot, then whittle down even extremely large data sets to find your target. Below is a bit more about that. I thought it was cool that Kaggle appears to still have the original data set online:
- https://en.wikipedia.org/wiki/Netflix_Prize
- https://www.netflixprize.com
- https://www.netflixprize.com/community/topic_1537.html
- https://www.kaggle.com/netflix-inc/netflix-prize-data
- https://www.wired.com/2009/09/bellkors-pragmatic-chaos-wins-1-million-netflix-prize
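The "few nuggets" idea is easy to see in a toy example. Below is a minimal sketch, using made-up data rather than the real Netflix Prize set, in which each known fact about a person (a movie and the rating they gave it) shrinks the pool of matching "anonymous" users. The data, names and function narrow_candidates are all hypothetical.

```python
# Made-up "anonymized" ratings: user id -> {movie title: rating}.
RATINGS = {
    "user_001": {"Movie A": 5, "Movie B": 2, "Movie C": 4},
    "user_002": {"Movie A": 5, "Movie B": 3, "Movie D": 1},
    "user_003": {"Movie A": 5, "Movie B": 2, "Movie D": 1},
    "user_004": {"Movie C": 4, "Movie D": 1},
}


def narrow_candidates(ratings, known_facts):
    """Return the sorted candidate list after each known fact is applied."""
    candidates = set(ratings)
    steps = []
    for movie, rating in known_facts:
        # Keep only the users whose record agrees with this fact.
        candidates = {user for user in candidates
                      if ratings[user].get(movie) == rating}
        steps.append(sorted(candidates))
    return steps


if __name__ == "__main__":
    facts = [("Movie A", 5), ("Movie B", 2), ("Movie D", 1)]
    for fact, remaining in zip(facts, narrow_candidates(RATINGS, facts)):
        print(fact, "->", remaining)
```

With only three facts the pool goes from four users to one; the same intersection logic is what lets an attacker whittle down a much larger data set.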
TAKEDOWN
OK, so you found some exposure; now what? If you find exposed code, passwords or data, and the person who posted it is a present or past employee, probably the fastest way to have it removed is to just ask them to take it down. Otherwise, each repo site has its own method and process you’ll need to follow.
- For GitHub, below is their current process (although it could change once Microsoft acquires them[16]). Their DMCA (Digital Millennium Copyright Act) takedown process is a bit different, so make sure you know which one applies before you make the request; it will save you some headache.
- For GitLab, I think the following is their submission page:
- For PasteBin, their takedown process page is listed below:
- For StackOverflow, the best takedown info I could find is below:
- For Trello, both DMCA and regular takedown instructions seem to be posted at the following link:
- And…I’d like to think as a last resort only…you can always lawyer-up. One way is to have an attorney request the person who posted the stuff take it down, and the other way is to take legal action to protect your proprietary/copyrighted material.
FINAL THOUGHTS
After reading about cloud concerns, if you feel a bit overwhelmed or like you want to learn more, SANS offers a new Summit that is strictly around securing your cloud environment. I’m not sure if they plan to post the talks online, but they tend to do that for their summits, so if you can’t be there in person, check out their site…
SANS Cloud INsecurity Summit:
“Discover the 10 most damaging mistakes large cloud users unknowingly make and approaches to fixing them…”
What did I miss?
Can you think of anything I missed? If so, can you add it to the comments to help others improve their global security posture?
REFERENCES
[1] https://gizmodo.com/millions-of-time-warner-customer-records-exposed-in-thi-1798701579
[2] http://fortune.com/2018/04/12/uber-data-breach-security
[3] https://www.bbc.com/news/technology-42166004
[4] https://www.upguard.com/breaches/verizon-cloud-leak
[5] http://apollotarget.com/eg-html
[6] https://www.forbes.com/sites/davelewis/2017/11/27/do-it-yourself-data-breaches-with-s3-buckets
[9] https://zeltser.com/paste-sites-for-pen-testing-reconnaissance
[10] https://fossbytes.com/buckhacker-search-engine-hacker
[11] https://www.bbc.com/news/technology-43057681
[12] https://motherboard.vice.com/en_us/article/j5bgm3/buckhacke-amazon-server-search-engine-aws-security
[13] https://snyk.io/blog/leaked-credentials-in-packages
[14] http://www.robotstxt.org/robotstxt.html
[15] https://forensic4cast.com/forensic-4cast-awards/2018-awards
[19] https://www.engadget.com/2018/08/09/amazon-aws-error-exposes-31-000-godaddy-servers