Commit

Merge pull request #17 from grantat/local-quote
Adds quoting to local version. Updates Readme with examples for disabling module and docker with environment variables. Related to #15.
grantat authored Oct 30, 2017
2 parents f74dd8b + 902a252 commit be457b8
Showing 3 changed files with 35 additions and 3 deletions.
30 changes: 28 additions & 2 deletions README.md
@@ -49,10 +49,36 @@ To run it as a local script:
 $ ./main.py -l search {URI-R}
 ```
 
-The backlinks calculation is costly to your computers, so it is recommended to turn it off:
+### Environment variables
+
+Carbon Date provides the option of passing in environment variables for both the Bing and Bitly services.
+For Bitly, the environment key is `CD_Bitly_token`. For Bing, the environment key is `CD_Bing_key`.
+Environment variables can be passed into docker using the `-e` or `--env` arguments before executing the Carbon Date application like so:
+
+```
+$ docker run -e "CD_Bitly_token=foo" -e "CD_Bing_key=bar" -it --rm oduwsdl/carbondate ./main.py -l search http://www.cs.odu.edu/
+```
+
+### Disabling modules
+
+CarbonDate provides the option of preventing searching for specified modules in the local version.
+For example, if a user wants to disable backlinks and google modules the user can add the `-e` argument after a URI-R is specified like so:
+
+```
+./main.py -l search "https://www.cs.odu.edu/" -e cdGetBacklinks cdGetGoogle
+```
+
+A complete list of all the modules a user can disable is as follows:
 
 ```
-$ ./main.py -l search {URI-R} -e cdGetBacklinks
+cdGetPubdate
+cdGetArchives
+cdGetBing
+cdGetBitly
+cdGetTwitter
+cdGetBacklinks
+cdGetGoogle
+cdGetLastModified
 ```
 
 ## How to add your module
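
Reviewer note: the two README sections added in this file describe environment keys and a `-e` exclusion list. The sketch below shows one way such inputs could be consumed in Python. It is not taken from this repository: `load_service_keys`, `enabled_modules`, and the argument layout are hypothetical, and only the key names and module names come from the README text.

```python
import argparse
import os

# Module names as listed in the README diff.
ALL_MODULES = [
    'cdGetPubdate', 'cdGetArchives', 'cdGetBing', 'cdGetBitly',
    'cdGetTwitter', 'cdGetBacklinks', 'cdGetGoogle', 'cdGetLastModified',
]


def load_service_keys(environ=os.environ):
    """Hypothetical helper: read the two keys named in the README.

    Returns None for a key that is not set in the environment.
    """
    return {
        'bitly': environ.get('CD_Bitly_token'),
        'bing': environ.get('CD_Bing_key'),
    }


def enabled_modules(argv):
    """Hypothetical helper: drop every module named after -e."""
    parser = argparse.ArgumentParser()
    parser.add_argument('url')
    # nargs='*' collects all module names that follow -e.
    parser.add_argument('-e', nargs='*', default=[], dest='exclude')
    args = parser.parse_args(argv)
    return [m for m in ALL_MODULES if m not in args.exclude]


keys = load_service_keys({'CD_Bitly_token': 'foo', 'CD_Bing_key': 'bar'})
remaining = enabled_modules(
    ['https://www.cs.odu.edu/', '-e', 'cdGetBacklinks', 'cdGetGoogle'])
```

Passing the environment as a parameter keeps the helper testable without touching the real process environment.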
6 changes: 5 additions & 1 deletion main.py
@@ -3,6 +3,7 @@
 import argparse
 import os
 import sys
+import shlex
 
 logo = ('''
 _____ ___ _____ _____ _____ __ _ _____ ___ _____ _____
@@ -36,6 +37,9 @@ def parserinit():
     if args.server:
         os.system('./server.py')
     elif args.local:
-        os.system('./local.py %s' % ' '.join(sys.argv[2:]))
+        arg_str = ""
+        for a in sys.argv[2:]:
+            arg_str += (' ' + shlex.quote(a))
+        os.system('./local.py%s' % (arg_str))
     elif args.lh:
         os.system('./local.py -h')
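
Reviewer note: the effect of the quoting change can be illustrated with a small standalone sketch. It is not code from this repository: `build_command` and the example URL are hypothetical, and `shlex.split` is used only as an approximation of POSIX shell word splitting (it honours whitespace and quotes but does not interpret operators such as `&`, which a real shell would).

```python
import shlex


def build_command(script, args):
    """Hypothetical helper: append each argument, shell-quoted, to script."""
    return script + ''.join(' ' + shlex.quote(a) for a in args)


# Example arguments; the URL contains a space and an ampersand on purpose.
args = ['search', 'http://example.com/a b?x=1&y=2', '-e', 'cdGetBacklinks']

# Old behaviour: plain join, so the shell re-splits the URL at the space.
naive = './local.py %s' % ' '.join(args)

# New behaviour: each argument is quoted before joining.
quoted = build_command('./local.py', args)

naive_tokens = shlex.split(naive)    # URL is broken into two tokens
quoted_tokens = shlex.split(quoted)  # URL survives as a single token
```

An alternative that avoids the shell entirely would be `subprocess.run(['./local.py'] + sys.argv[2:])`, which passes the argument list through without any re-parsing. That is a suggestion, not something this commit does.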
2 changes: 2 additions & 0 deletions modules/cdGetGoogle.py
@@ -6,6 +6,7 @@
 from .cdGetLowest import getLowest, validateDate
 from random import randint
 import logging
+import urllib.parse
 
 moduleTag = 'google.com'
 
@@ -135,6 +136,7 @@ def getGoogle(url, outputArray, indexOfOutputArray, verbose=False, **kwargs):
 
     # Caution google blocks bots which do not play nice
     # return ''
+    url = urllib.parse.quote(url, safe='')
     query = ('https://www.google.com/search?hl=en&tbo=d&tbs=qdr:y15'
              '&q=inurl:' + url + '&oq=inurl:' + url)
 
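
Reviewer note: a minimal sketch of what `urllib.parse.quote(url, safe='')` does to a URI-R before it is placed in the search query. The function name `build_query` and the example URL are hypothetical. The query template is copied from the hunk above.

```python
import urllib.parse


def build_query(url):
    """Hypothetical helper: percent-encode url and embed it in the query."""
    # safe='' also encodes '/', which quote() leaves untouched by default.
    encoded = urllib.parse.quote(url, safe='')
    return ('https://www.google.com/search?hl=en&tbo=d&tbs=qdr:y15'
            '&q=inurl:' + encoded + '&oq=inurl:' + encoded)


example = 'http://example.com/a?b=1&c=2'
encoded_example = urllib.parse.quote(example, safe='')
query = build_query(example)
```

Without the encoding step, the `&c=2` in the example would be read as a separate parameter of the search URL instead of as part of the `inurl:` term.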
