Laika BOSS + Bro = LaikaBro (?!)
Feb 18, 2017
UPDATE 02–19–2017: The laika-bro-client.py script referenced below is now on GitHub.
Over the past few nights I took some time to understand how Lockheed Martin’s Laika BOSS works in a networked environment and, after getting it set up in a virtualized network relatively quickly, was tempted to get it working with Bro. I was surprised by how little information there is on getting these two tools working together, so I thought I’d share my experience. (Props to the team at LMCO for providing a script that works with Redis and Suricata, and Emerson Electric for providing solid integration documentation for their File Scanning Framework and Bro.)
The development environment for this project is pretty simple: two Ubuntu (14.04.5) systems located on the same subnet via a private virtual network. Since this setup is only for development, the systems have a minimum number of cores (1) and the default amount of memory (1GB). The client system, which represents a Bro sensor, has Bro 2.5 installed; the server system, which represents a Laika cluster, has Laika installed and running as a service (laikad in asynchronous mode). Effectively, the server is waiting for something to send it files to process. But how can we do that?
At a high level, the plan for this project is:
- Use Bro to identify files in network traffic and extract them to a staging directory on the client system
- Use a script to monitor the staging directory and send files to the server
- Log the Laika output on the server
Extracting Files w/ Bro
To get files to the Laika server, we have to first get the files. Files are sent across the network all the time, so why not start there? Bro is a great choice for extracting files from network traffic because it can identify files across multiple application layer protocols (HTTP, SMTP, FTP, etc) and natively supports file extraction. To get the files we want Laika to process, we just need to have Bro extract them.
The extraction script (extract-laika.bro) does just that: Bro will extract every file it sniffs (so long as the file is under 10MB in size) and save it to the /tmp/monitor/ directory on the client (this is the staging directory that we’ll monitor separately). This script is effective for development, but also dangerously simple. Even in small production environments, I would expect it to extract hundreds to thousands of files each day. This is the first bottleneck in the whole process: if Bro is extracting too many files, then our downstream processes (file transfer, Laika scanning) will get needlessly backed up.
To make the script more effective, we should only extract files that Laika can derive metadata from (e.g., PE files) or that need to be scanned with Yara. The easiest way to do this is to use Bro’s file MIME type identification (meta$mime_type) and only extract files that Laika can identify. Also, be sure to skip very large files (f$seen_bytes): you don’t want to be reading 1GB files in Python and trying to send them across your network.
Last thing: take note of the fname we’re using for each extracted file, because we’re going to see it come back later. Here’s an example of what happens when we run Bro with the script (note that I’m using a PCAP shared by Malware-Traffic-Analysis):
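The filtering decision itself is small enough to write down. The sketch below expresses it in Python purely for illustration; in practice the check would live in the Bro script, next to meta$mime_type and f$seen_bytes. The allowlist contents and the should_extract name are my assumptions for the example, not a list of what Laika actually supports.

```python
# Illustrative policy only: the MIME types below are assumed examples,
# not an authoritative list of what Laika can identify.
ALLOWED_MIME_TYPES = {
    "application/x-dosexec",  # PE files
    "application/pdf",
    "application/zip",
}

# Same 10MB ceiling the extraction script uses.
MAX_EXTRACT_BYTES = 10 * 1024 * 1024


def should_extract(mime_type, seen_bytes,
                   allowed=ALLOWED_MIME_TYPES,
                   max_bytes=MAX_EXTRACT_BYTES):
    """Return True only for small files whose MIME type is on the allowlist."""
    if mime_type is None:
        # No MIME type was identified, so there is nothing to match against.
        return False
    if seen_bytes > max_bytes:
        return False
    return mime_type in allowed
```

An allowlist (rather than a blocklist) keeps the volume predictable: anything not explicitly wanted never reaches the staging directory.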
Running ‘extract-laika.bro’ on a PCAP file shared by Malware-Traffic-Analysis
Getting Files to Laika
Now that we have files, we just need to figure out how to get them to the Laika server. (Remember, the server is running asynchronously — it’s just waiting for files to arrive.) Solving this piece was the most complicated part of this project — it’s relatively easy to get the files to the server (thanks, Python!), but doing so efficiently is another story.
For this, I considered that in most networks Bro sensors would likely be sniffing and extracting lots of files — so we should expect to be sending large batches of files to the Laika server. Laika comes with a script that accounts for this (cloudscan.py), though it’s used in a different context (the server sends scan results back to the client because the client is expected to be a user waiting in a Terminal); for our use here, we just need to get the files to the server (no need for a response). cloudscan.py handles this with multi-processing (mp) and taking a similar approach here proved effective in my tests.
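To make the multiprocessing idea concrete, here is a minimal sketch of the queue-and-sentinel pattern: file names go into a shared queue, each worker drains it, and one None per worker tells it to stop. This is not cloudscan.py or the client script; worker, fill_queue, start_workers, and the send callable are hypothetical names for the example, and send stands in for whatever actually ships a file to Laika.

```python
import multiprocessing


def worker(input_queue, send):
    """Drain file names from the queue until the None sentinel arrives."""
    sent = 0
    for fname in iter(input_queue.get, None):
        send(fname)
        sent += 1
    return sent


def fill_queue(input_queue, fnames, num_workers):
    """Queue every file name, then one None sentinel per worker."""
    for fname in fnames:
        input_queue.put(fname)
    for _ in range(num_workers):
        input_queue.put(None)


def start_workers(fnames, send, num_workers=4):
    """Spread fnames across num_workers subprocesses and wait for them.

    send must be a module-level function so it can be handed to the
    subprocesses on platforms that spawn rather than fork.
    """
    input_queue = multiprocessing.Queue()
    fill_queue(input_queue, fnames, num_workers)
    procs = [
        multiprocessing.Process(target=worker, args=(input_queue, send))
        for _ in range(num_workers)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```

Because nothing needs to come back from the server, there is no result queue here, which is the main simplification compared with a client that waits for scan results.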
That said, that’s just one problem. Another is that we need to manage the script so that it executes on a loop and doesn’t execute multiple times. The common solution for this is to use a combination of cron and a script-managed pid / lock file, but I was looking for something more compact, and then I found schedule. Schedule provides functionality similar to cron, but makes it available directly in a Python script. I’m not sure that I would use this approach in a production environment, but for testing, it proved to be very useful. Here’s the bit of code where schedule is used. It runs in a while loop and kicks off a function (kick, which is used to get the mp file transfer going) according to the variable sched_time:
Main function runs the scheduler.
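For readers without schedule installed, the same loop shape can be sketched with only the standard library. This is a stand-in, not the code from laika-bro-client.py: run_every and max_runs are names made up for the example (max_runs exists only so the loop can be stopped in a test), while kick and sched_time mirror the names used above.

```python
import time


def run_every(sched_time, kick, max_runs=None):
    """Call kick() every sched_time seconds; loop forever if max_runs is None."""
    runs = 0
    next_run = time.monotonic() + sched_time
    while max_runs is None or runs < max_runs:
        delay = next_run - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        kick()
        runs += 1
        # The next run is timed from when kick() returned, so a slow kick()
        # pushes later runs back instead of stacking them up.
        next_run = time.monotonic() + sched_time
    return runs
```

Note that this only prevents overlap if kick() blocks until the transfer is done. If kick() merely starts subprocesses and returns, runs can still overlap.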
The biggest problem, though, is making sure that we send only one copy of every file extracted from Bro to the Laika server. This is difficult because, while it is very fast, the file transfer process can take longer than the scheduled interval. For example, if the schedule is set to kick off the file transfer process every 10 minutes, but one run actually takes 15 minutes to complete, then we’ll have two file transfer processes (each with its own subprocesses) running at the same time, which is when duplicate file transfers can occur. I avoid this problem by immediately deleting the extracted file from the client after it is sent to the server, but this creates a different variation of the same problem: a file may be queued for transfer but already be gone from disk. I got around this issue in an inelegant way:
    def sanity_check(fname):
        """Function that verifies a file exists."""
        check = True
        if not os.path.isfile(fname):
            check = False
        return check

    # The def line below is reconstructed; the excerpt shows only the
    # docstring and body, so the name and arguments are assumptions.
    def worker(broker, input_queue):
        """Function that defines the worker routine
        for each subprocess. Try/except statement
        exists to handle any files that are in the
        queue but have been removed from the file
        system. For dev purposes, each subprocess is
        assigned a random worker number."""
        client = Client(broker, async=True)
        randNum = randint(1, 10000)
        for fname in iter(input_queue.get, None):
            print 'Worker %s sending new object' % randNum
            file_buffer = open(fname, 'rb').read()
            externalObject = ExternalObject(
                buffer=file_buffer,
                externalVars=ExternalVars(
                    filename=fname,
                    source='bro',
                    extMetaData=fname_to_json(fname)),
                level='level_minimal')
            # (excerpt: the full script goes on to send externalObject
            # to the server and delete the file)
Every file that is processed for transfer runs through the sanity_check function to make sure it still exists (if it doesn’t, then the file is popped out of the queue and nothing happens); there’s also a try/except inside the file transfer function to catch any file that may have somehow passed sanity_check. However, I only ran into this issue when I went looking for it: using a PCAP from the 2012 MACCDC as a test, I extracted 3030 files with Bro to /tmp/monitor/ and set the script to kick off file transfers every second. This had the expected effect (the transfer function kicked off multiple times before previous file transfers could finish and tried to transfer files that no longer existed), but is so far out of the norm that under normal use, this shouldn’t happen.
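A different way to attack the overlap, which I have not tested against Laika, is to refuse to start a transfer run while the previous one is still going. The sketch below does that with a non-blocking lock. guarded_kick and transfer are hypothetical names, and the lock only helps when every run is started from the same Python process (separate processes would need a lock file instead).

```python
import threading

# One lock shared by every scheduled run in this process.
_transfer_lock = threading.Lock()


def guarded_kick(transfer, lock=_transfer_lock):
    """Run transfer() unless a previous run still holds the lock.

    Returns True if transfer() ran, False if the run was skipped.
    """
    if not lock.acquire(blocking=False):
        # A previous run is still in progress; skip this one.
        return False
    try:
        transfer()
        return True
    finally:
        lock.release()
```

Skipping a run is harmless here because any files left in the staging directory are simply picked up by the next run.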
(It’s also worth noting that it took my client only a few seconds to send those 3030 files. After 20 minutes I checked the Laika server and verified it received and processed them all. I was really happy with these results, but recognize that they don’t represent a production environment.)
Here’s the full prototype script:
Another thing worth mentioning is that this script doesn’t protect you from shooting yourself in the foot: it always deletes files after it tries to send them (even if the file transfer fails!), it doesn’t have file size restrictions, etc. These are things I’ll likely go back and add later.
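As a rough idea of what those guard rails could look like, here is a sketch that checks the file size before reading and deletes the file only after the send succeeds. It is not part of laika-bro-client.py: send_then_delete, the send callable, and the 10MB default are assumptions for the example, with send standing in for the real Laika client call.

```python
import os


def send_then_delete(fname, send, max_bytes=10 * 1024 * 1024):
    """Send one file and delete it only on success.

    Returns 'sent', 'skipped' (too large), 'missing', or 'failed'.
    """
    try:
        size = os.path.getsize(fname)
    except OSError:
        return "missing"
    if size > max_bytes:
        # Too large to read into memory and ship across the network.
        return "skipped"
    try:
        with open(fname, "rb") as handle:
            file_buffer = handle.read()
    except OSError:
        # The file disappeared between the size check and the read.
        return "missing"
    try:
        send(file_buffer)
    except Exception:
        # Keep the file so a later run can retry it.
        return "failed"
    try:
        os.remove(fname)
    except OSError:
        pass  # already removed by something else
    return "sent"
```

Keeping failed files around reintroduces the duplicate-transfer risk described earlier, so this approach would need to be paired with some way of preventing overlapping runs.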
Here’s what this actually looks like running on the VM systems. The VM on the left is the server (running the laikad service in the bottom Terminal) and the VM on the right is the client (running laika-bro-client.py in the bottom Terminal). The images are before and after shots of what happens when Bro is executed on the client and files are sent to the server.