Project Build | PenguinPro.ca

Real Project Build

Hey again. As some of you may be aware, my first project selection and build attempt did not exactly take off like I expected it to, but that is totally okay. “For my next trick,” as you might be thinking, I am going to discuss the build process of another software project. Although my first software choice may not have been ideal, I would like to remind you that it was really just “not ideal for the situation.” For the class I am taking, that software would present too many roadblocks, which both hinder learning and put meeting the deadlines at risk. I am hopeful that I can continue with it in the future, but for now I will present my newly chosen software: ClamAV.

ClamAV required about a 5 minute build and maybe a 20 minute setup, and I was easily ready to start profiling the software for optimization possibilities within an hour. Compare that to my previous choice, which took more than 10X as long to build and about 30X as long to configure and set up (and is still not complete). Before I go any further, here is the link to my previous build attempt for context: https://penguinpro.ca/2019/11/05/software-build/

Now, barring any further catastrophes, let’s start our ClamAV build with some reasoning:

Why ClamAV:

It more or less found me. I happened to take a moment to check my Twitter when I noticed that a well-known figure in the tech community had posted about a recent POC that was released online: https://twitter.com/hackerfantastic/status/1190685521153937408

Further analysis led me to their website and eventually to the ClamAV GitHub source repo: https://github.com/Cisco-Talos/clamav-devel

Here I noticed that the repository README had a section labeled “Want to make a contribution” followed by a “thank you,” which is always a good sign, right? From there I looked on their website and found that their documentation on build instructions was really well done, and even included some profiling/benchmarking instructions.

You can check it out here: https://www.clamav.net/documents/clamav-development

With all that (and the fact that I was out one project), there was really only one option: clone the source and attempt the build:

git clone https://github.com/Cisco-Talos/clamav-devel

From there it was a simple matter of creating a build directory and running:

CFLAGS="-ggdb -O0" ./configure --prefix=`pwd`/installed --enable-debug --enable-check --enable-coverage --with-systemdsystemunitdir=no --enable-experimental --enable-clamdtop --enable-xml --enable-pcre --disable-llvm

This again was well documented, even detailing the reasoning behind each flag. I also created another directory and followed the instructions again, this time including -pg for gprof within the CFLAGS line, but more on that in later posts.

Building:

Next it was a simple matter of running:

make -j <number of cores>

make install

For this, the time command reported:

real 1m2.708s

user 2m12.359s

sys 0m20.945s
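The <number of cores> placeholder can be filled in automatically. Here is a minimal sketch, assuming a POSIX shell: nproc comes from GNU coreutils, getconf _NPROCESSORS_ONLN is a common fallback, and the JOBS variable name is just my own choice. The build step only runs when a Makefile is present, so the snippet is harmless outside a configured source tree.

```shell
#!/bin/sh
# Work out how many jobs to hand to make -j.
# nproc (GNU coreutils) is tried first, then getconf, then a safe default of 1.
JOBS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "building with $JOBS parallel jobs"

# Only build when run from a configured source tree.
if [ -f Makefile ]; then
    make -j "$JOBS" && make install
fi
```

Prefixing the make line with time gives the same real/user/sys figures shown above.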

Configuration:

After this, to make the program run we just need to create a few directories and copy some configuration files from their “example” state to a “working” state. For this build I am going to use a general configuration, meaning that I am going to leave a lot of the defaults as they are set.

This is done in the etc directory under the prefix that you installed the software into. There are two files here: clamd.conf.sample and freshclam.conf.sample.

We need to open both of these files with a text editor and modify the line near the top (line 7 in my case) by commenting out the word “Example.” To do this, just start the line with a ‘#’ symbol.

The other line we need to modify in clamd.conf.sample is LogFile (line 14 currently), which sets the log file location. It may be a good idea to put it in the build directory; in my case the line looked like:

LogFile /home/builder/clamav/var/log/clamd.log

As well as the line labeled:

ExtendedDetectionInfo yes

And finally:

LocalSocket /tmp/clamd.socket

Now save this file as clamd.conf in the same etc directory (that’s the local build etc, not /etc).
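The same edits can be scripted instead of done by hand. This is only a sketch: the make_clamd_conf function name is my own, and the demo runs against a tiny stand-in file that assumes the sample ships these options as commented-out lines (check your own clamd.conf.sample before relying on the patterns).

```shell
#!/bin/sh
# Derive a working clamd.conf from the sample without hand-editing.
# Arguments: sample file, output file, log file path.
make_clamd_conf() {
    sample=$1; out=$2; logfile=$3
    sed -e 's/^Example/#Example/' \
        -e "s|^#LogFile .*|LogFile $logfile|" \
        -e 's/^#ExtendedDetectionInfo .*/ExtendedDetectionInfo yes/' \
        -e 's|^#LocalSocket .*|LocalSocket /tmp/clamd.socket|' \
        "$sample" > "$out"
}

# Demo on a stand-in sample (assumed layout, not the real file).
demo_dir=$(mktemp -d)
printf '%s\n' \
    '# Comment or remove the line below.' \
    'Example' \
    '#LogFile /tmp/clamd.log' \
    '#ExtendedDetectionInfo yes' \
    '#LocalSocket /tmp/clamd.socket' > "$demo_dir/clamd.conf.sample"

make_clamd_conf "$demo_dir/clamd.conf.sample" "$demo_dir/clamd.conf" \
    /home/builder/clamav/var/log/clamd.log
result=$(cat "$demo_dir/clamd.conf")
rm -rf "$demo_dir"
printf '%s\n' "$result"
```

Writing to a new file rather than editing in place keeps the .sample file around as a reference for the defaults.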

Our next step is to modify the config file: freshclam.conf.sample in a similar way.

First, just like above, comment out the Example line so that it reads:

#Example

Then find and uncomment the line:

#Debug yes

Similarly, this file needs to be saved as freshclam.conf, again in the same etc directory.
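As far as I can tell, the ClamAV tools refuse to start while the Example line is still active, so a quick check on both files can save a confusing first run. A small sketch follows; conf_is_ready is a hypothetical helper name and the two demo files are stand-ins, not the real configs.

```shell
#!/bin/sh
# Succeed (exit status 0) when the config has no active "Example" line.
conf_is_ready() {
    ! grep -q '^Example' "$1"
}

# Demo on two stand-in files: one untouched, one edited.
tmp=$(mktemp -d)
printf 'Example\n#Debug yes\n' > "$tmp/untouched.conf"
printf '#Example\nDebug yes\n' > "$tmp/edited.conf"

if conf_is_ready "$tmp/untouched.conf"; then untouched=ready; else untouched=blocked; fi
if conf_is_ready "$tmp/edited.conf"; then edited=ready; else edited=blocked; fi
rm -rf "$tmp"

echo "untouched.conf: $untouched"
echo "edited.conf: $edited"
```

In a real build the function would be pointed at etc/clamd.conf and etc/freshclam.conf under the install prefix.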

Creating Directories:

Our next step is to create the directories that we defined in the above configuration files. From the main build directory we can run the command:

mkdir -p var/log

That is the directory we told clamd.conf to save log information to. Although we are not using this setup for anything real, generating log files can generally be helpful/realistic.

Next we need another directory for our AV database:

mkdir share/clamav

Then one last directory that we can put some files into to run our scan against:

mkdir ~/clamav_test

Note that our build has provided us with some benign files that should trip clamscan when it is run against them. They are in our build directory and can be copied over with:

cp -vi clamav-devel/test/* ~/clamav_test

The reason we copy the samples over instead of running a scan against this directory itself is that we can add our own files to extend our test without modifying the contents of the original build directory.
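The directory steps above can be collected into one repeatable snippet. This is a sketch under assumptions: PREFIX and TEST_DIR are my own variable names, and when they are unset the snippet falls back to a throwaway temporary directory so it can be tried without touching a real build.

```shell
#!/bin/sh
# PREFIX and TEST_DIR are stand-ins; set them to the real install prefix
# and test directory before running this against an actual build.
PREFIX=${PREFIX:-$(mktemp -d)}
TEST_DIR=${TEST_DIR:-$PREFIX/clamav_test}

# Log directory, signature database directory, and scan target.
mkdir -p "$PREFIX/var/log" "$PREFIX/share/clamav" "$TEST_DIR"

# Confirm each directory exists before moving on.
for d in "$PREFIX/var/log" "$PREFIX/share/clamav" "$TEST_DIR"; do
    if [ -d "$d" ]; then
        echo "ok: $d"
    else
        echo "missing: $d"
    fi
done
```

Using mkdir -p throughout means the snippet can be re-run safely if a directory already exists.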

First run:

First we need to get the signatures to compare against. To do this, run the following from our local build bin directory:

./freshclam

Notice that since we enabled debug information, we get some output from the run that may be of interest to us. Next, go into the local sbin directory and run:

./clamd

This starts the ClamAV daemon, and then we are ready to start playing with the tools in the local bin directory, most notably “clamscan,” which we will now run against our test directory with:

./clamscan ~/clamav_test
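When scripting scans it helps to know that clamscan documents its exit status as 0 for no virus found, 1 for virus(es) found, and 2 for errors. Below is a small sketch that turns the status into a message; describe_scan_status is a hypothetical helper name, and the scan itself only runs if a clamscan binary is in the current directory.

```shell
#!/bin/sh
# Map clamscan's exit status to a short message.
describe_scan_status() {
    case "$1" in
        0) echo "clean: no virus found" ;;
        1) echo "detections: virus(es) found" ;;
        *) echo "error: scan did not complete normally" ;;
    esac
}

# Run the scan only when a clamscan binary is present here.
if [ -x ./clamscan ]; then
    status=0
    ./clamscan "$HOME/clamav_test" || status=$?
    describe_scan_status "$status"
fi
```

With the benign test samples, a status of 1 is the expected result, since detections are exactly what we want to see.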

Output:

Our first run seemed to be a success: the output states that 52 files were scanned with 48 detections, and the total run time was reported as:

real 2m31.918s

user 2m28.895s

sys 0m1.242s
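A quick way to read these figures is to divide CPU time (user plus sys) by wall-clock time (real): a value near 1 suggests one busy core, while a value above 1 suggests work spread over several. Here is a sketch of that arithmetic with awk; to_seconds and cpu_ratio are names I made up for illustration.

```shell
#!/bin/sh
# Convert a time-style value such as 2m31.918s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of CPU time (user + sys) to wall-clock time (real).
# Arguments: real, user, sys.
cpu_ratio() {
    awk -v r="$(to_seconds "$1")" -v u="$(to_seconds "$2")" -v s="$(to_seconds "$3")" \
        'BEGIN { printf "%.2f\n", (u + s) / r }'
}

cpu_ratio 2m31.918s 2m28.895s 0m1.242s   # the clamscan run
cpu_ratio 1m2.708s 2m12.359s 0m20.945s   # the parallel build from earlier
```

The scan comes out at about 0.99 and the build at about 2.44, so the scan was effectively bound to a single core while the build was not.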

Notably, this process ran at 100% on a single core and held a large memory pool (virtual, resident and shared) before any output was produced. Interestingly, the program also prints its own run time in the final output, which will be helpful when it comes to benchmarking. In this run it reported 151.634 seconds (2m31s), which is directly in line with what the time command reported.
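For benchmarking, the self-reported time can be pulled out of the scan summary rather than copied by hand. The sketch below assumes the summary contains a line of the form "Time: 151.634 sec (2 m 31 s)"; the summary text here is a stand-in built from the numbers in this post, and scan_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# Pull the elapsed seconds out of a clamscan summary read from stdin.
scan_seconds() {
    awk '$1 == "Time:" { print $2 }'
}

# Stand-in summary built from this post's numbers; the exact layout is an assumption.
summary='Scanned files: 52
Infected files: 48
Time: 151.634 sec (2 m 31 s)'

reported=$(printf '%s\n' "$summary" | scan_seconds)
echo "clamscan reported ${reported}s"

# Gap between the wall-clock figure from time (151.918s) and the self-reported value.
awk -v t=151.918 -v r="$reported" \
    'BEGIN { printf "time outside the scan timer: %.3fs\n", t - r }'
```

On a real run the summary would come from piping the clamscan output into scan_seconds instead of the stand-in text.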

Summary:

Overall I believe that I now have an appropriate software selection for my course project: one that does not take long to build (in comparison to my previous software selection) and that also has potential for optimization due to the type of work it does in terms of parsing and hashing. The added bonus of excellent documentation, and of it all being written in a language I personally really love, makes me feel like I was able to find a good candidate after all.

Hopefully this information is helpful to you. I hope to have my next post up soon, as I am anxiously excited to start digging deeper into this technology.

Love as always,

ElliePenguins