For the moment, there are three different ways to feed AIL with data:
-
Be a collaborator of CIRCL and ask to access our feed. It will be sent to the static IP you are using for AIL.
-
You can setup pystemon and use the custom feeder provided by AIL (see below).
-
You can feed your own data using the ./bin/import_dir.py script.
AIL is an analysis tool, not a collector! However, if you want to collect some pastes and feed them to AIL, the procedure is described below. Nevertheless, moderate your queries!
Feed data to AIL:
-
Clone the pystemon's git repository
-
Install its python dependencies inside your virtual environment
-
Launch pystemon
./pystemon
-
Edit your configuration file
bin/packages/config.cfg
and modify the pystemonpath path accordingly -
Launch pystemon-feeder
./bin/feeder/pystemon-feeder.py
If you want to add a new processing or analysis module in AIL, follow these simple steps:
-
Add your module name in ./bin/packages/modules.cfg and subscribe to at least one module at minimum (Usually, Redis_Global).
-
Use ./bin/template.py as a sample module and create a new file in bin/ with the module name used in the modules.cfg configuration.
If you want to add a new webpage for a module in AIL, follow these simple steps:
-
Launch ./var/www/create_new_web_module.py and enter the name to use for your webpage (Usually, your newly created python module).
-
A template and flask skeleton has been created for your new webpage in ./var/www/modules/
-
Edit the created html files under the template folder as well as the Flask_* python script so that they fit your needs.
-
You can change the order of your module in the top navigation header in the file ./var/www/templates/header_base.html
-
You can ignore module, and so, not display them in the top navigation header by adding the module name in the file ./var/www/templates/ignored_modules.txt
Feel free to fork the code, play with it, make some patches or add additional analysis modules.
To contribute your module, feel free to pull your contribution.
You can do a lots of things easily with the ./bin/ModulesInformationV2 script:
- Monitor the health of other modules
- Monitor the ressources comsumption of other modules
- Start one or more modules
- Kill running modules
- Restart automatically stuck modules
- Show the paste currently processed by a module
You can navigate into the interface by using arrow keys. In order to perform an action on a selected module, you can either press or to show the dialog box.
To change list, you can press the key.
Also, you can quickly stop or start modules by clicking on the <K>
or <S>
symbol respectively. These are located in the Action column.
Finally, you can quit this program by pressing either <q>
or <C-c>
.
In AIL, you can track terms, set of terms and even regexes without creating a dedicated module. To do so, go to the tab Terms Frequency
in the web interface.
- You can track a term by simply putting it in the box.
- You can track a set of terms by simply putting terms in an array surrounded by the '' character. You can also set a custom threshold regarding the number of terms that must match to trigger the detection. For example, if you want to track the terms term1 and term2 at the same time, you can use the following rule:
\[term1, term2, [100]]\
- You can track regexes as easily as tracking a term. You just have to put your regex in the box surrounded by the '/' character. For example, if you want to track the regex matching all email address having the domain domain.net, you can use the following aggressive rule:
/*.domain.net/
.
In AIL, you can crawl Tor hidden services. Don't forget to review the proxy configuration of your Tor client and especially if you enabled the SOCKS5 proxy and binding on the appropriate IP address reachable via the dockers where Splash runs.
There are two types of installation. You can install a local or a remote Splash server.
(Splash host) = the server running the splash service
(AIL host) = the server running AIL
-
(Splash host) Launch
crawler_hidden_services_install.sh
to install all requirements (typey
if a localhost splah server is used or use the-y
option) -
(Splash host) To install and setup your tor proxy:
- Install the tor proxy:
sudo apt-get install tor -y
(Not required ifSplah host == AIL host
- The tor proxy is installed by default in AIL) - Allow Tor to bind to any interface or to the docker interface (by default binds to 127.0.0.1 only) in
/etc/tor/torrc
SOCKSPort 0.0.0.0:9050
orSOCKSPort 172.17.0.1:9050
- Add the following line
SOCKSPolicy accept 172.17.0.0/16
in/etc/tor/torrc
(for a linux docker, the localhost IP is 172.17.0.1; Should be adapted for other platform) - Restart the tor proxy:
sudo service tor restart
- Install the tor proxy:
-
(AIL host) Edit the
/bin/packages/config.cfg
file:- In the crawler section, set
activate_crawler
toTrue
- Change the IP address of Splash servers if needed (remote only)
- Set
splash_onion_port
according to your Splash servers port numbers that will be used. those ports numbers should be described as a single port (ex: 8050) or a port range (ex: 8050-8052 for 8050,8051,8052 ports).
- In the crawler section, set
- (Splash host) Launch all Splash servers with:
sudo ./bin/torcrawler/launch_splash_crawler.sh -f <config absolute_path> -p <port_start> -n <number_of_splash>
With<port_start>
and<number_of_splash>
matching those specified atsplash_onion_port
in the configuration file of point 3 (/bin/packages/config.cfg
)
All Splash dockers are launched inside the Docker_Splash
screen. You can use sudo screen -r Docker_Splash
to connect to the screen session and check all Splash servers status.
- (AIL host) launch all AIL crawler scripts using:
./bin/LAUNCH.sh -c
crawler_hidden_services_install.sh -y
- Add the following line in
SOCKSPolicy accept 172.17.0.0/16
in/etc/tor/torrc
sudo service tor restart
- set activate_crawler to True in
/bin/packages/config.cfg
sudo ./bin/torcrawler/launch_splash_crawler.sh -f $AIL_HOME/configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 1
If AIL framework is not started, it's required to start it before the crawler service:
./bin/LAUNCH.sh -l
Then starting the crawler service (if you follow the procedure above)
./bin/LAUNCH.sh -c