Learn Python – Python Forensics and Virtualization | Hash Functions – Basic and Advanced

In this tutorial, we will study forensic science using Python: basic Python forensics applications, hash functions, cracking an encryption, virtualization, naming conventions, Dshell and Scapy, and network forensics, each with a detailed explanation.


Collecting and preserving evidence is critical for cyber forensic investigation and analysis of computer devices. It plays an important role in a courtroom, where it is used against the criminal. Nowadays, technology lets us get information just by typing a query into the browser. But it also attracts cyber criminals. Cyber criminals are people who carry out malicious activity using their computers and the internet. They can get all your information while sitting somewhere else.

With its wide range of applications, Python also provides the facility to work with digital forensics. Using it, we can gather data, extract evidence, and also hash passwords. It helps us preserve the reliability of evidence.

Before going further, you must be familiar with Python and its advanced concepts.

Introduction to Computational Forensics

Computational forensics is a field of study used to solve problems in various forensic disciplines. It makes use of computer-based modeling, analysis, computer simulation, and recognition. Python Forensics was introduced by Chet Hosmer. The field covers a range of objects, substances, and procedures. It relies mainly on pattern evidence, such as fingerprints, shoeprints, toolmarks, and documents, but also on physiological and behavioral patterns, digital evidence, DNA, and crime scenes.

We can also use various algorithms to deal with signal and image processing. Using algorithms, we can also handle data mining, computer graphics, machine learning, computer vision, data visualization, and statistical pattern recognition.

In short, computational forensics is used to study digital evidence, and it deals with various kinds of evidence.

Naming Conventions for Python Forensics Application

We should be familiar with the naming conventions and patterns in order to follow the Python forensics guidelines. Consider the following table.

| Item | Naming Convention | Example |
| --- | --- | --- |
| Local variable | camelCase with optional underscores | studentName |
| Constant | Uppercase, words separated by underscores | STUDENT_NAME |
| Global variable | Prefix followed by camelCase with optional underscores | my_studentName |
| Function | PascalCase with optional underscores; active voice | MyStudentName |
| Module | Underscore prefix followed by camelCase | _studentname |
| Class | Prefix class_ followed by PascalCase; keep it short | class_MyStudentName |
| Object | Prefix ob_ followed by camelCase | ob_studentName |
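
To make the table concrete, here is a small sketch that applies each convention once. All names are hypothetical and chosen only for illustration. Note that these conventions follow the table above and differ from PEP 8, which most Python projects use.

```python
STUDENT_NAME = "default"             # constant: uppercase with underscores
my_studentName = "global value"      # global variable: prefix plus camelCase


class class_MyStudentName:           # class: class_ prefix plus PascalCase
    pass


def GetStudentName():                # function: PascalCase, active voice
    studentName = STUDENT_NAME       # local variable: camelCase
    return studentName


ob_studentName = class_MyStudentName()   # object: ob_ prefix plus camelCase
print(GetStudentName())
```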

A hashing algorithm takes a stream of binary data as its input. In a real-life scenario, we can hash a password, a file, or any other kind of digital data. The algorithm takes the input and generates a fixed-size message digest. Let’s see the given example.


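import hashlib
name = input("Enter the name: ")
# hashlib.md5 replaces the md5 module, which exists only in Python 2
print(hashlib.md5(name.encode()).hexdigest())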
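
The same idea applies to files. The sketch below computes a SHA-256 digest in fixed-size chunks, so a large evidence file does not have to fit in memory. The helper name hash_file, the chunk size, and the temporary sample file are assumptions made for this illustration.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read the file piece by piece so large images do not exhaust memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Create a small temporary file that stands in for a piece of evidence
with tempfile.NamedTemporaryFile(delete=False) as sample:
    sample.write(b"sample evidence")
    sample_path = sample.name

file_digest = hash_file(sample_path)
print(file_digest)
os.remove(sample_path)
```

Recording such a digest when evidence is acquired lets an investigator show later that the file has not changed.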

Python Hash Function

A Python hash function maps a large amount of data to a fixed-size value. The same input always returns the same output. The result is called a hash sum and acts as a fingerprint of the specific data. Once we map the data to a fixed-size value, it can’t be reversed. That’s why we also refer to it as a one-way cryptographic algorithm.

Let’s understand the following example –

Example –

import hashlib  
import uuid  
def hash_pass(password):  
    s = uuid.uuid4().hex  
    return hashlib.sha256(s.encode() + password.encode()).hexdigest() + ':' + s  
def verify_password(hashed_password, user_password):  
    password, s = hashed_password.split(':')  
    return password == hashlib.sha256(s.encode() + user_password.encode()).hexdigest()  
new_password = input('Enter your password :')  
hashed_password = hash_pass(new_password)  
print('The hash string to store in the db is: ' + hashed_password)  


Enter your password: sharma
The hash string to store in the db is: 947782bdb0c7a5ad642f1f26179b6aef2d9857427b45a09af4fce3b8f1346e91:8a8371941513482487e5ab8af2ae6466

Now, we will re-enter the password.

old_password = input('Enter new password ')
if verify_password(hashed_password, old_password):
    print(' Entered password is correct')
else:
    print('Passwords do not match')


Enter your password devansh 
The hash string to store in the db is: 4762866edd3b49c7736163ef3d981e42629a09a9ca7e081f56d116e137d77b9c:ebbf5b16bd9f4b989505a495bf7ae9b9
Enter new password sharma
Passwords do not match

The hash function has the following properties.

We can easily compute the hash value for any input value.

It is infeasible to find an input that produces a given hash value.

It is infeasible to modify the input without changing the hash value.
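
These properties can be observed with the standard hashlib module. The following is a minimal sketch; the helper sha256_hex and the sample strings are made up for this illustration.

```python
import hashlib


def sha256_hex(text):
    """Hypothetical helper: SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode()).hexdigest()


first = sha256_hex("evidence")
second = sha256_hex("evidence")
changed = sha256_hex("Evidence")  # only the first letter differs

print(first == second)    # the same input gives the same digest
print(first == changed)   # a one-letter change gives a different digest
# Digests have the same length regardless of the input size
print(len(first), len(sha256_hex("a much longer input " * 100)))
```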

Cracking an Encryption in Python

We have to understand how to crack the text data that we fetch during analysis and evidence gathering. First, let’s understand basic cryptography.

Generally, secret messages are sent by military personnel to convey their plans without being read by their enemies. These messages are not in a human-readable format. The plain text is encrypted using an encryption algorithm, and the resulting text is referred to as cipher text.

Suppose a commander sends a message to a senior officer and wants to protect the text from enemies. Here, we shift each plain text letter four places in the alphabet. Now, every A becomes E, every B becomes F, and so on.
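
The four-place shift described above can be sketched as follows. The helper shift_letters and the sample message are assumptions for illustration. Only ASCII letters are shifted, and the alphabet wraps around after Z.

```python
import string


def shift_letters(text, shift=4):
    """Shift each ASCII letter `shift` places, wrapping at Z (hypothetical helper)."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)  # leave spaces, digits, and punctuation unchanged
    return "".join(result)


cipher = shift_letters("ATTACK AT DAWN", 4)
print(cipher)                      # the encrypted message
print(shift_letters(cipher, -4))   # shifting back recovers the plain text
```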

Let’s understand the following example, which cracks the cipher text.

Example –

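def decryption(key, cipher):
    # Undo a shift of `key` places over the 95 printable ASCII characters (32..126)
    simple_text = ''
    for each in cipher:
        x = (ord(each) - 32 - key) % 95 + 32
        simple_text += chr(x)
    return simple_text

cipher_text = input('Enter the message: ')
# Brute force: try every possible shift and print each candidate plain text
for i in range(1, 95, 1):
    print(i, decryption(i, cipher_text))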


Enter message: Yes


Virtualization is the act of emulating IT systems such as workstations, networks, and storage. We create a virtual instance of such a resource. This is done with the help of a hypervisor.

Hardware virtualization plays a very important role in computer forensics. Using virtualization, we get the following advantages.

We can use the workstation in a validated state for each investigation.

We can recover deleted data by attaching the dd image of a drive to a virtual machine.

The virtual machine can serve as the recovery tool that helps to gather evidence.

We outline the following steps to create a virtual machine using Python.

Step – 1: Suppose we name our virtual machine “dummy”. Each virtual machine must have at least 512 MB of memory, expressed in bytes.

virmach_memory = 512 * 1024 * 1024  

Step – 2: Now, we connect this virtual machine to the default cluster.

virmach_cluster = api.clusters.get(name = "Default")  

Step – 3: Next, boot the virtual machine from the virtual hard disk drive.

virmach_os = params.OperatingSystem(boot = [params.Boot(dev = "hd")])  

Now, we combine the above steps into a virtual machine parameter object. Let’s understand the following example.

Example –

from ovirtsdk.xml import params
from ovirtsdk.api import API

try:
    # We need to provide API credentials for the virtual machine.
    # USERNAME, PASSWORD and ca.crt are placeholders, not real values.
    api = API(url="https://HOST",
              username="USERNAME",
              password="PASSWORD",
              ca_file="ca.crt")
    virmach_name = "dummy"
    virmach_memory = 512 * 1024 * 1024  # calculating the memory in bytes
    virmach_cluster = api.clusters.get(name="Default")
    virmach_template = api.templates.get(name="Blank")
    # here we are assigning the parameters to operating system
    virmach_os = params.OperatingSystem(boot=[params.Boot(dev="hd")])
    virmach_params = params.VM(name=virmach_name,
                               memory=virmach_memory,
                               cluster=virmach_cluster,
                               template=virmach_template,
                               os=virmach_os)
    try:
        api.vms.add(vm=virmach_params)
        print("Virtual machine '%s' added successfully." % virmach_name)
    except Exception as ex:
        print("Adding virtual machine '%s' failed: %s" % (virmach_name, ex))
    api.disconnect()
except Exception as ex:
    print("Unexpected error: %s" % ex)


Virtual Machine dummy added successfully.

Network Forensics in Python

Python also provides the facility to work with network forensics. In modern network forensics environments, investigations can run into many difficulties. These can include responding to a breach report, performing vulnerability assessments, or validating regulatory compliance. Let’s understand the basic terminology of network forensics.

Client – The client side runs on personal computers and workstations.

Server – The server executes the client’s request.

Protocols – Protocols are the sets of rules that must be followed during data transfer.

Websockets – WebSocket is a protocol that provides full-duplex communication and runs over a TCP connection. We can send bi-directional messages using websockets.

With the help of these protocols, we can validate the information sent or received by third-party users. However, encryption is necessary to secure the channel.

Let’s understand the following example of a network client.

Example –

import socket  
# creating a socket object  
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  
# getting local machine name  
host = socket.gethostname()  
port = 8080  
# connection to hostname on the port.  
sock.connect((host, port))  
# Receive no more than 1024 bytes  
temp = sock.recv(1024)  
print("The client waits for connection")  


The client waits for connection
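
The client above needs a server listening on the chosen port. As a self-contained sketch, the following runs a one-shot server and a client in the same script over the loopback interface. The helper serve_once, the message text, and the use of port 0 (which lets the operating system pick a free port) are assumptions for this illustration.

```python
import socket
import threading


def serve_once(server_sock, message):
    """Accept a single client, send `message`, then close (hypothetical helper)."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.sendall(message)


# Server side: bind to the loopback interface on a free port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server, b"hello from server"))
worker.start()

# Client side: connect, then read until the server closes the connection
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
chunks = []
while True:
    data = client.recv(1024)  # receive no more than 1024 bytes per call
    if not data:
        break
    chunks.append(data)
client.close()
received = b"".join(chunks)

worker.join()
server.close()
print(received.decode())
```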

Python Scapy and Dshell

Let’s take a quick look at Python Scapy and Dshell.

Python Scapy

Scapy is a Python-based tool used to analyze and manipulate network traffic. With the help of Scapy, we can perform packet manipulation. We can also capture and decode packets of a wide range of protocols. The benefit of using Scapy is that it gives the investigator a detailed description of the network traffic. Third-party tools such as OS fingerprinting can also be used with Scapy. Let’s understand the following example.

Example –

# Imports scapy and the GeoIP toolkit
import GeoIP
from scapy.all import IP

geoIp = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)

def locatePackage(pkg):
    # extracts the source IP address
    source = pkg.getlayer(IP).src
    # extracts the destination IP address
    destination = pkg.getlayer(IP).dst
    # gets country details of source
    srcCountry = geoIp.country_code_by_addr(source)
    # gets country details of destination
    dstCountry = geoIp.country_code_by_addr(destination)
    # str() guards against None, which is returned for unknown addresses
    print(source + "(" + str(srcCountry) + ") >> " + destination + "(" + str(dstCountry) + ")\n")


source INDIA >> destination USA

Python Dshell

Dshell is a Python-based network forensics analysis toolkit. It was developed by the US Army Research Laboratory and released as open source in 2014. It makes forensic investigation much easier. Dshell provides the following decoders.

reservedips – It identifies DNS resolutions that fall into reserved IP space.

rip-http – It extracts files from HTTP traffic.

large-flows – It is a decoder that lists large net flows.

protocols – It identifies non-standard protocols.

dns – It extracts DNS-related queries.

Python Searching

Searching is the most important part of a forensic investigation. Nowadays, a good search depends on the investigator who is working through the evidence. Keyword searching within messages is a pillar of the investigation. We can find strong evidence with the help of a keyword.

Both experience and knowledge are required to get information from deleted messages.

Python provides various built-in mechanisms to support search operations. The investigator can find results using keywords such as “who”, “what”, “where”, “when”, “which”, etc. Let’s understand the following example.

Example –

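# Searching a particular word from a message
str1 = "This is an example for Computational forensics of gathering evidence!"
str2 = "string"
# find(keyword, start) returns the index of the first match at or after start,
# or -1 when the keyword does not occur, as is the case for "string" here
print(str1.find(str2, 10))
print(str1.find(str2, 40))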
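
When a keyword may occur several times or in a different letter case, the re module is one option. The sketch below is only an illustration, and the helper find_keyword is a hypothetical name.

```python
import re


def find_keyword(message, keyword):
    """Return the start index of every case-insensitive match (hypothetical helper)."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [match.start() for match in pattern.finditer(message)]


message = "This is an example for Computational forensics of gathering evidence!"
print(find_keyword(message, "forensics"))  # one match
print(find_keyword(message, "FOR"))        # matches "for" and the start of "forensics"
print(find_keyword(message, "string"))     # no match gives an empty list
```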



Python Indexing

Indexing is a feature the investigator can use to gather potential evidence from files. The evidence can be contained within a memory snapshot, a disk image, a file, or a network trace. Indexing is very useful for reducing the time spent on time-consuming tasks like keyword searching. It is also used to locate keywords in the interactive searching phase. In the following example, we demonstrate indexing in Python.

Example –

list1 = [123, 'example', 'creative', 'indexing']  
print("Index example : ", list1.index('example'))  
print("Index for indexing : ", list1.index('indexing'))  
str1 = "This is a message for forensic investigation indexing"  
str2 = "message"  
# str.index() returns the position of the first occurrence of the keyword
print("Index of the character keyword found is ", str1.index(str2))  


Index example :  1
Index for indexing :  3
Index of the character keyword found is  10
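
To show why an index speeds up repeated keyword searches, here is a minimal sketch of an inverted index that maps each word to the documents containing it. The helper build_index, the file names, and the texts are made up for this illustration.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            # Strip common punctuation so "message," and "message" are the same key
            index[word.strip(".,!?")].add(name)
    return index


documents = {
    "mail1.txt": "This is a message for forensic investigation indexing",
    "mail2.txt": "Another message, gathered as evidence.",
}
index = build_index(documents)
print(sorted(index["message"]))
print(sorted(index["evidence"]))
```

Once the index is built, each lookup is a dictionary access instead of a scan over every document.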

Python Image Library

The real purpose of a forensic investigation is to extract valuable information from the available resources. Getting all the relevant information from a resource is essential for the report. It helps us derive sound results.

Resource data can be either a simple data structure such as a database or a complex data structure such as a JPEG image.

An investigator can easily access information in a simple data structure, but extracting information from a complex data structure is a tedious task.

Python provides the Python Imaging Library, known as PIL. It adds image processing capabilities to our Python interpreter. It also supports many file formats and graphics capabilities, and provides powerful image processing. Let’s understand how to extract data from images.

We use a programming example to explain how it actually works.

Step – 1: Suppose we have the following image from which we want to extract the details.

Step – 2: An image consists of many pixel values. The PIL library is used to extract the image details for gathering evidence. Let’s understand the following example.

Example –

from PIL import Image
im = Image.open('penguin.jpeg', 'r')
pix_val = list(im.getdata())  # one (R, G, B) tuple per pixel
pix_val_flat = [x for sets in pix_val for x in sets]  # flatten the tuples into one list
print(pix_val_flat)




The output is returned in the form of a list. It contains the RGB pixel values, which give a better picture of what data is needed.

Python Multiprocessing Support

Forensic specialists find it difficult to apply digital solutions to the large amounts of digital evidence found in common crimes. Most digital investigation tools are single-threaded, which means we can execute only one command at a time. Let’s see a brief introduction to multiprocessing.


Multiprocessing is the ability of a system to support more than one process. It allows several programs to run concurrently. There are two types of multiprocessing – symmetric and asymmetric processing.

Let’s understand the following example of multiprocessing.

Example –

import random
import multiprocessing

def list_append(count, id, out_list):
    # append `count` random numbers to the list inside the worker process
    for i in range(count):
        out_list.append(random.random())

if __name__ == "__main__":
    size = 810
    procs = 2
    jobs = []
    for i in range(0, procs):
        out_list = list()  # list filled by the worker process
        process1 = multiprocessing.Process(
            target=list_append, args=(size, i, out_list))
        # appends the process to the list of jobs
        jobs.append(process1)
    # start every process
    for j in jobs:
        j.start()  # initiate the process
    # wait until the processes have finished execution
    for j in jobs:
        j.join()
    print("List processing complete.")


List processing complete

Mobile Forensics in Python

Forensic investigation is not limited to standard computer hardware such as hard disks, CPUs, etc. It also relies on techniques for analyzing non-standard hardware or transient evidence.

Nowadays, smartphones are widely used in digital investigations, but they are still considered non-standard. With proper examination of a smartphone, we can extract photos, messages, and other smartphone data.

Android smartphones use a PIN or an alphanumeric password. The password can be between four and 16 digits/characters.

In the following example, we will get through a lock screen to extract data. The smartphone password is typically stored in a file named password.key in /data/system.

Android stores a salted SHA1 hash sum and MD5 hash sum of this password. Let’s see the following example.

Example –

public byte[] passwordToHash(String password) {
  if (password == null) {
     return null;
  }
  String algo = null;
  byte[] hashed = null;
  try {
     byte[] saltedPassword = (password + getSalt()).getBytes();
     byte[] sha1 = MessageDigest.getInstance(algo = "SHA-1").digest(saltedPassword);
     byte[] md5 = MessageDigest.getInstance(algo = "MD5").digest(saltedPassword);
     hashed = (toHex(sha1) + toHex(md5)).getBytes();
  } catch (NoSuchAlgorithmException e) {
     Log.w(TAG, "Failed to encode string because of missing algorithm: " + algo);
  }
  return hashed;
}

The above code shows how the smartphone password is hashed, which is what must be reproduced to crack it. A dictionary attack may not be effective for cracking the password because the hashed password is combined with a salt. The salt is the hexadecimal string representation of a random 64-bit integer. A rooted smartphone or a JTAG adapter can be used to access the salt.
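
For readers following along in Python, the Java routine above can be approximated as shown below. This is a sketch under stated assumptions: the salt is passed in as a string, the text is encoded as UTF-8, and the hex digits are uppercase. The helper recover_pin, the example salt, and the example PIN are made up for illustration, and the loop only covers numeric PINs of a fixed length.

```python
import hashlib


def password_to_hash(password, salt):
    """Concatenate the SHA-1 and MD5 hex digests of password + salt."""
    salted = (password + salt).encode()  # assumption: UTF-8 encoding
    sha1_hex = hashlib.sha1(salted).hexdigest().upper()  # assumption: uppercase hex
    md5_hex = hashlib.md5(salted).hexdigest().upper()
    return (sha1_hex + md5_hex).encode()


def recover_pin(target_hash, salt, digits=4):
    """Try every numeric PIN of the given length; return the match or None."""
    for candidate in range(10 ** digits):
        pin = str(candidate).zfill(digits)
        if password_to_hash(pin, salt) == target_hash:
            return pin
    return None


example_salt = "1a2b3c4d5e6f7a8b"   # made-up salt, for illustration only
stored = password_to_hash("4821", example_salt)
print(recover_pin(stored, example_salt))
```

The sketch also shows why the salt matters: without the correct salt, the same loop finds no match.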

Rooted Smartphones

On a rooted smartphone, the hash can be dumped from the file /data/system/password.key, while the salt is stored in the SQLite database settings.db under the key lockscreen.password_salt.

JTAG Adapter

JTAG stands for Joint Test Action Group, and a JTAG adapter can be used to access the salt. Similarly, a Riff-Box or a JIG-Adapter can be used to access the salt. Using the information obtained from the Riff-Box, we can locate the position of the encrypted data. The rules are given below.

Find the associated string “password_salt”.

A byte represents the width of the salt, which is its length.

This is the length that is actually searched for to get the stored password/PIN of the smartphone.

Memory and Forensics

Python forensics specifically focuses on volatile memory with the help of Volatility, which is a Python-based framework.

Volatile Memory

Volatile memory is a type of memory that is erased when the system’s power is turned off or interrupted. In simple words, if we are working on a document that has not been saved to the hard disk and the power goes off, we will lose our data.

Volatile memory forensics follows the same pattern as other forensic investigations.

First, select the target of the investigation.

Acquire the forensic data.

Perform the forensic analysis.

A RAM dump is used to analyze the data gathered from the RAM.
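
A common first look at a RAM dump is to pull out runs of printable text. The sketch below does this on an in-memory byte string. It is not part of Volatility, and the helper extract_strings, the minimum length, and the fake dump are assumptions made for this illustration.

```python
import re


def extract_strings(dump, min_length=4):
    """Return runs of printable ASCII at least `min_length` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(dump)]


# Made-up bytes standing in for a fragment of a memory dump
fake_dump = b"\x00\x01user=admin\x00\xff\x10ab\x00password.key\x00"
print(extract_strings(fake_dump))
```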

YARA Rules

YARA is a tool used to examine suspected files/directories and match strings. It is based on a pattern matching implementation. It plays an essential role in forensic analysis.
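
The pattern-matching idea can be illustrated with a toy matcher in plain Python. This is not YARA’s rule language or engine; the helper match_rules and both rules are made up, and a rule matches only when all of its byte strings occur in the data.

```python
def match_rules(data, rules):
    """Return the names of rules whose byte patterns all occur in `data`."""
    matched = []
    for name, patterns in rules.items():
        if all(pattern in data for pattern in patterns):
            matched.append(name)
    return matched


# Made-up rules: each rule name maps to byte strings that must all be present
rules = {
    "suspicious_script": [b"powershell", b"-enc"],
    "pe_header": [b"MZ", b"This program cannot be run in DOS mode"],
}
sample = b"cmd /c powershell -enc SQBFAFgA"
print(match_rules(sample, rules))
```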

Example –

import operator
import os
import sys
sys.path.insert(0, os.getcwd())
import plyara.interp as interp
# Plyara is a script that lexes and parses a file consisting of one or more YARA
# rules into a Python dictionary representation.
if __name__ == '__main__':
    file_to_analyze = sys.argv[1]
    Dictrules = interp.parseString(open(file_to_analyze).read())
    authors = {}
    imps = {}
    meta_keys = {}
    max_strings = []
    max_string_len = 0
    tags = {}
    rule_count = 0
    for rule in Dictrules:
        rule_count += 1
        # Imports: count how often each module is imported
        if 'imports' in rule:
            for imp in rule['imports']:
                imp = imp.replace('"', '')
                if imp in imps:
                    imps[imp] += 1
                else:
                    imps[imp] = 1
        # Tags: count how often each tag is used
        if 'tags' in rule:
            for tag in rule['tags']:
                if tag in tags:
                    tags[tag] += 1
                else:
                    tags[tag] = 1
        # Metadata: count the metadata keys and the rule authors
        if 'metadata' in rule:
            for key in rule['metadata']:
                if key in meta_keys:
                    meta_keys[key] += 1
                else:
                    meta_keys[key] = 1
                if key in ['Author', 'author']:
                    if rule['metadata'][key] in authors:
                        authors[rule['metadata'][key]] += 1
                    else:
                        authors[rule['metadata'][key]] = 1
        # Strings: keep track of the longest string values seen so far
        if 'strings' in rule:
            for strr in rule['strings']:
                if len(strr['value']) > max_string_len:
                    max_string_len = len(strr['value'])
                    max_strings = [(rule['rule_name'], strr['name'], strr['value'])]
                elif len(strr['value']) == max_string_len:
                    max_strings.append((rule['rule_name'], strr['name'], strr['value']))
    print("\nThe number of rules implemented: " + str(rule_count))
    # Sort each counter from most to least frequent
    ordered_meta_keys = sorted(meta_keys.items(), key=operator.itemgetter(1),
                               reverse=True)
    ordered_authors = sorted(authors.items(), key=operator.itemgetter(1),
                             reverse=True)
    ordered_imps = sorted(imps.items(), key=operator.itemgetter(1), reverse=True)
    ordered_tags = sorted(tags.items(), key=operator.itemgetter(1), reverse=True)