What Is PCAP
Packet Capture or PCAP (also known as libpcap) is an application programming interface (API) that captures live network packet data from OSI model Layers 2-7. Network analyzers like Wireshark create .pcap files to collect and record packet data from a network. PCAP comes in a range of formats including Libpcap, WinPcap, and PCAPng.
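To make the file format concrete, here is a minimal sketch that parses the 24-byte global header found at the start of a classic libpcap capture file. It assumes a little-endian file with microsecond timestamps (magic number 0xA1B2C3D4) and does not handle PCAPng. The function name read_pcap_header and the sample bytes are illustrative only.

```python
import struct

# Classic libpcap global header: magic, version major/minor, timezone offset,
# timestamp accuracy, snapshot length, link-layer type (24 bytes in total).
PCAP_HEADER = struct.Struct("<IHHiIII")

def read_pcap_header(data):
    """Parse the global header of a little-endian, classic .pcap file."""
    if len(data) < PCAP_HEADER.size:
        raise ValueError("too short to be a pcap file")
    magic, major, minor, _zone, _sigfigs, snaplen, linktype = PCAP_HEADER.unpack(
        data[:PCAP_HEADER.size])
    if magic != 0xA1B2C3D4:
        raise ValueError("not a little-endian microsecond pcap file")
    return {"version": (major, minor), "snaplen": snaplen, "linktype": linktype}

# Hypothetical header built in memory, standing in for the first bytes of a capture.
sample = PCAP_HEADER.pack(0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
info = read_pcap_header(sample)
print(info)
```

A link-layer type of 1 means the packets that follow are Ethernet frames. With a real capture you would pass in the first 24 bytes read from the file in binary mode.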
Why Do We Need To Use PCAP
PCAP is a valuable resource for analyzing capture files and monitoring your network traffic. Packet collection tools like Wireshark allow you to collect network traffic and translate it into a human-readable format. There are many reasons to use PCAP for network monitoring. Some of the most common are monitoring bandwidth usage, identifying rogue DHCP servers, detecting malware, troubleshooting DNS resolution, and supporting incident response.
What Is Face Detection
Face detection is a computer technology being used in a variety of applications that identifies human faces in digital images. Face detection also refers to the psychological process by which humans locate and attend to faces in a visual scene.
Wireshark and other tools like Network Miner are great for interactively exploring packet capture files, but there will be times where you want to slice and dice PCAPs using Python and Scapy. Some great use cases are generating fuzzing test cases based on captured network traffic or even something as simple as replaying traffic that you have previously captured.
We are going to take a slightly different spin on this and attempt to carve out image files from HTTP traffic. With these image files in hand, we will use OpenCV, a computer vision tool, to attempt to detect images that contain human faces so that we can narrow down images that might be interesting. We can use our previous ARP poisoning script to generate the PCAP files, or you could extend the ARP poisoning sniffer to do on-the-fly facial detection of images while the target is browsing. Let’s get started by dropping in the code necessary to perform the PCAP analysis. Open pic_carver.py and enter the following code:
import re
import zlib
import cv2

from scapy.all import *

pictures_directory = "/home/justin/pic_carver/pictures"
faces_directory = "/home/justin/pic_carver/faces"
pcap_file = "bhp.pcap"

def http_assembler(pcap_file):
    carved_images = 0
    faces_detected = 0

    a = rdpcap(pcap_file)
This is the main skeleton logic of our entire script, and we will add in the supporting functions shortly. To start, we open the PCAP file for processing.
    sessions = a.sessions()

    for session in sessions:
        http_payload = ""

        for packet in sessions[session]:
            try:
                if packet[TCP].dport == 80 or packet[TCP].sport == 80:
We take advantage of a beautiful feature of Scapy to automatically separate each TCP session into a dictionary.
                    # reassemble the stream
                    http_payload += str(packet[TCP].payload)
            except:
                pass
We use that and filter out only HTTP traffic, and then concatenate the payload of all of the HTTP traffic into a single buffer. This is effectively the same as right-clicking in Wireshark and selecting Follow TCP Stream.
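The script joins payloads in the order the packets appear in the capture. As a rough illustration of the same idea without Scapy, the sketch below groups segments by flow and joins their payloads in sequence-number order, which also copes with out-of-order and retransmitted segments. The tuple layout, the reassemble name, and the sample addresses are assumptions made for this example, and sequence-number wraparound is ignored.

```python
from collections import defaultdict

def reassemble(segments):
    """Group TCP segments by flow and join payloads in sequence-number order.

    Each segment is a hypothetical (src, sport, dst, dport, seq, payload) tuple.
    """
    flows = defaultdict(dict)
    for src, sport, dst, dport, seq, payload in segments:
        if 80 not in (sport, dport) or not payload:
            continue  # keep only port-80 traffic that carries data
        # Keying on seq keeps the first copy of a retransmitted segment.
        flows[(src, sport, dst, dport)].setdefault(seq, payload)
    return {flow: b"".join(chunks[s] for s in sorted(chunks))
            for flow, chunks in flows.items()}

segments = [
    ("10.0.0.5", 80, "10.0.0.9", 51000, 1017, b"Content-Type: image/gif\r\n\r\nGIF89a"),
    ("10.0.0.5", 80, "10.0.0.9", 51000, 1000, b"HTTP/1.1 200 OK\r\n"),
    ("10.0.0.5", 80, "10.0.0.9", 51000, 1000, b"HTTP/1.1 200 OK\r\n"),  # retransmission
    ("10.0.0.9", 51000, "10.0.0.5", 22, 5, b"ignored"),                 # not HTTP
]
streams = reassemble(segments)
print(streams)
```

Each flow here is one direction of a connection, so the server's response ends up in its own buffer, ready for header parsing.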
        headers = get_http_headers(http_payload)

        if headers is None:
            continue
After we have the HTTP data reassembled, we pass it off to our HTTP header parsing function, which will allow us to inspect the HTTP headers individually.
        image,image_type = extract_image(headers,http_payload)

        if image is not None and image_type is not None:
After we validate that we are receiving an image back in an HTTP response, we extract the raw image and return the image type and the binary body of the image itself. This is not a bulletproof image extraction routine, but as you’ll see, it works amazingly well.
            # store the image
            file_name = "%s-pic_carver_%d.%s" % (pcap_file,carved_images,image_type)

            fd = open("%s/%s" % (pictures_directory,file_name),"wb")
            fd.write(image)
            fd.close()

            carved_images += 1
Here we store the extracted image.
            # now attempt face detection
            try:
                result = face_detect("%s/%s" % (pictures_directory,file_name),file_name)

                if result is True:
                    faces_detected += 1
            except:
                pass

    return carved_images, faces_detected

carved_images, faces_detected = http_assembler(pcap_file)

print "Extracted: %d images" % carved_images
print "Detected: %d faces" % faces_detected
Here we will pass the file path along to our facial detection routine.
Now let’s create the supporting functions by adding the following code above our http_assembler function.
def get_http_headers(http_payload):
    try:
        # split the headers off if it is HTTP traffic
        headers_raw = http_payload[:http_payload.index("\r\n\r\n")+2]

        # break out the headers
        headers = dict(re.findall(r"(?P<name>.*?): (?P<value>.*?)\r\n", headers_raw))
    except:
        return None

    if "Content-Type" not in headers:
        return None

    return headers

def extract_image(headers,http_payload):
    image = None
    image_type = None

    try:
        if "image" in headers['Content-Type']:
            # grab the image type (the part after the slash) and image body
            image_type = headers['Content-Type'].split("/")[1]

            image = http_payload[http_payload.index("\r\n\r\n")+4:]

            # if we detect compression decompress the image
            try:
                if "Content-Encoding" in headers.keys():
                    if headers['Content-Encoding'] == "gzip":
                        image = zlib.decompress(image, 16+zlib.MAX_WBITS)
                    elif headers['Content-Encoding'] == "deflate":
                        image = zlib.decompress(image)
            except:
                pass
    except:
        return None,None

    return image,image_type
These supporting functions help us to take a closer look at the HTTP data that we retrieved from our PCAP file. The get_http_headers function takes the raw HTTP traffic and splits out the headers using a regular expression. The extract_image function takes the HTTP headers and determines whether we received an image in the HTTP response. If we detect that the Content-Type header does indeed contain the image MIME type, we split out the type of image (the part after the slash, such as jpeg). If there is compression applied to the image in transit, we attempt to decompress it before returning the raw image buffer and the image type.
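For readers on Python 3, where packet payloads are bytes and not str, the same two steps might look like the sketch below. It is an illustrative variant, not the script's own code: the names split_http_response and image_from_response are made up here, header names are lowercased for lookup, and the sample response is built in memory.

```python
import gzip
import re
import zlib

def split_http_response(raw):
    """Return (headers, body) for a raw HTTP response, or (None, None)."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return None, None
    pairs = re.findall(rb"([^:\r\n]+): ([^\r\n]*)", head)
    headers = {k.decode("latin-1").lower(): v.decode("latin-1") for k, v in pairs}
    return headers, body

def image_from_response(raw):
    """Return (image_bytes, image_type), or (None, None) if no image is present."""
    headers, body = split_http_response(raw)
    if not headers or not headers.get("content-type", "").startswith("image/"):
        return None, None
    # "image/png; q=1" -> "png"
    image_type = headers["content-type"].split("/")[1].split(";")[0].strip()
    encoding = headers.get("content-encoding", "")
    if encoding == "gzip":
        body = zlib.decompress(body, 16 + zlib.MAX_WBITS)  # 16+ selects the gzip wrapper
    elif encoding == "deflate":
        body = zlib.decompress(body)
    return body, image_type

# Hypothetical gzip-compressed response standing in for reassembled traffic.
raw = (b"HTTP/1.1 200 OK\r\nContent-Type: image/png\r\n"
       b"Content-Encoding: gzip\r\n\r\n" + gzip.compress(b"\x89PNG fake body"))
image, image_type = image_from_response(raw)
print(image_type, image)
```

Unlike the script, this sketch lets a decompression error propagate to the caller, so a truncated body is reported instead of being saved as an unreadable image.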
Now let’s drop in our facial detection code to determine if there is a human face in any of the images that we retrieved. Add the following code to pic_carver.py:
def face_detect(path,file_name):
    img = cv2.imread(path)
    cascade = cv2.CascadeClassifier("haarcascade_frontalface_alt.xml")
    rects = cascade.detectMultiScale(img, 1.3, 4, cv2.cv.CV_HAAR_SCALE_IMAGE, (20,20))

    if len(rects) == 0:
        return False

    rects[:, 2:] += rects[:, :2]
Using the OpenCV Python bindings, we can read in the image and then apply a classifier that is trained in advance for detecting faces in a front facing orientation. There are classifiers for profile (sideways) face detection, hands, fruit, and a whole host of other objects that you can try out for yourself. After the detection has been run, it will return rectangle coordinates that correspond to where the face was detected in the image.
    # highlight the faces in the image
    for x1,y1,x2,y2 in rects:
        cv2.rectangle(img,(x1,y1),(x2,y2),(127,255,0),2)

    cv2.imwrite("%s/%s-%s" % (faces_directory,pcap_file,file_name),img)

    return True
We then draw an actual green rectangle over that area and write out the resulting image.
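One step in face_detect is easy to miss: the detector reports each face as x, y, width, height, while cv2.rectangle expects two corner points. The line rects[:, 2:] += rects[:, :2] does that conversion in place on the array. The pure-Python sketch below shows the same arithmetic on plain tuples; the to_corners name and the sample box are illustrative only.

```python
def to_corners(rects):
    """Convert (x, y, width, height) boxes to (x1, y1, x2, y2) corner pairs."""
    return [(x, y, x + w, y + h) for x, y, w, h in rects]

# A hypothetical detection: a 100x120 box whose top-left corner is at (40, 30).
boxes = to_corners([(40, 30, 100, 120)])
print(boxes)
```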
Now let’s take this all for a spin inside your Kali VM.
Let’s Check Our Code
If you haven’t first installed the OpenCV libraries, run the following commands from a terminal in your Kali VM:
#:> apt-get install python-opencv python-numpy python-scipy
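This should install all of the files needed to handle facial detection on our resulting images. We also need the facial detection training file, haarcascade_frontalface_alt.xml, in the directory that the script runs from, because face_detect loads it by that relative name.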
Now create a couple of directories for our output, drop in a PCAP, and run the script. This should look something like this:
#:> mkdir pictures
#:> mkdir faces
#:> python pic_carver.py
Extracted: 189 images
Detected: 32 faces
#:>
Not every carved image will be processed cleanly, because some of the images we fed into OpenCV may be corrupt or partially downloaded, or their format might not be supported. (I’ll leave building a robust image extraction and validation routine as a homework assignment for you.) If you crack open your faces directory, you should see a number of files with faces and magic green boxes drawn around them.
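As one possible starting point for that homework, the sketch below checks the leading bytes of a carved buffer against the well-known signatures for JPEG, PNG, and GIF before it is saved. The sniff_image_type name and the sample buffers are assumptions for this example. A matching signature only shows that the data starts like an image, so a truncated download would still pass.

```python
# Leading "magic" bytes for three common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}

def sniff_image_type(data):
    """Return the image type suggested by the leading bytes, or None."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None

# Hypothetical carved buffers: one that starts like a JPEG, one that is really HTML.
print(sniff_image_type(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
print(sniff_image_type(b"<html><body>404</body></html>"))
```

A carved file whose sniffed type is None, or disagrees with the Content-Type header, could be skipped before it ever reaches the face detector.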
This technique can be used to determine what types of content your target is looking at, as well as to discover likely approaches via social engineering. You can of course extend this example beyond using it against carved images from PCAPs and use it in conjunction with web crawling.