Search paper titles on dblp.org and export their BibTeX entries.
conda create -n titles2bibtex python=3.8 -y
conda activate titles2bibtex
pip install lxml==4.6.3 requests==2.26.0 beautifulsoup4==4.9.3 pandas==1.3.1 tqdm==4.61.2 matplotlib==3.4.2
To find the papers whose titles contain the keyword empirical among those published in the most recent 100 issues of the journal IEEE_Transactions_on_Neural_Networks_and_Learning_Systems, run:
python search_papers_with_keywords_in_the_title.py -search 'IEEE_Transactions_on_Neural_Networks_and_Learning_Systems' -key empirical -max 100 --output_file_path
For more information about its usage, run python search_papers_with_keywords_in_the_title.py --help
Batch search over 5 journals ("IEEE Transactions on Software Engineering", "ACM Transactions on Software Engineering and Methodology", "Automated Software Engineering", "Empirical Software Engineering", "IEEE Transactions on Neural Networks and Learning Systems"). Keywords: empirical, code, commit, diff, change, generation, experimental, commits, diffs, changes, language, multi, multi-language, multilingual
bash search_papers_in_journals.sh
The log file is saved in log/search_papers_in_journals.sh.log
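The keyword search above can be pictured as a filter over paper titles. The following is only a minimal sketch, not the code of search_papers_with_keywords_in_the_title.py: the function name match_keywords is hypothetical, and case-insensitive whole-word matching is an assumption (it would explain why the keyword list contains both commit and commits).

```python
import re
from typing import Dict, Iterable, List


def match_keywords(titles: Iterable[str], keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Map each keyword to the titles containing it as a whole word.

    Hypothetical helper: matching is case-insensitive and word-bounded,
    which is an assumption about how the search behaves.
    """
    titles = list(titles)
    matches: Dict[str, List[str]] = {}
    for keyword in keywords:
        # (?<!\w) and (?!\w) act as word boundaries that also work for
        # hyphenated keywords such as "multi-language".
        pattern = re.compile(r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE)
        matches[keyword] = [title for title in titles if pattern.search(title)]
    return matches


if __name__ == "__main__":
    sample_titles = [
        "Boosting neural commit message generation with code semantic analysis",
        "CoRA: decomposing and describing tangled code changes for reviewer",
    ]
    for keyword, found in match_keywords(sample_titles, ["commit", "changes"]).items():
        print(keyword, len(found))
```

With whole-word matching, the keyword commit does not match a title that only contains commits, so both forms have to be listed explicitly.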
When several people co-write a paper, they may cite the same reference using different BibTeX entries derived from the same paper, which results in two citations of that paper. To unify the BibTeX information, we rely on dblp, a powerful online reference for bibliographic information on major computer science publications: by searching for papers by title, we can export largely consistent BibTeX entries.
In other words, the tool converts a set of paper titles into a set of uniform BibTeX entries.
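To make the title-to-BibTeX idea concrete, here is a rough sketch of how a title lookup against dblp could be structured. It is not the implementation of titles2bibtex.py. It assumes dblp's public publication search endpoint (https://dblp.org/search/publ/api with q, format and h parameters), a JSON response shaped as result → hits → hit → info, and the https://dblp.org/rec/<key>.bib export URL; all function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed dblp publication search endpoint.
DBLP_SEARCH_API = "https://dblp.org/search/publ/api"


def build_search_url(title: str, max_hits: int = 5) -> str:
    """Build a search URL for a title (q = query, h = maximum number of hits)."""
    return DBLP_SEARCH_API + "?" + urlencode({"q": title, "format": "json", "h": max_hits})


def normalize(title: str) -> str:
    """Lower-case, drop a trailing period and collapse whitespace for comparison."""
    return " ".join(title.strip().rstrip(".").lower().split())


def pick_record_key(response: dict, title: str) -> Optional[str]:
    """Return the dblp key of the first hit whose title matches exactly, if any.

    The response layout used here is an assumption about the JSON format.
    """
    hits = response.get("result", {}).get("hits", {}).get("hit", [])
    for hit in hits:
        info = hit.get("info", {})
        if normalize(info.get("title", "")) == normalize(title):
            return info.get("key")
    return None


def bibtex_url(key: str) -> str:
    """URL from which the BibTeX entry of a record could be downloaded (assumed)."""
    return "https://dblp.org/rec/" + key + ".bib"
```

In practice the response would be fetched with something like requests.get(build_search_url(title)).json(), and the text behind bibtex_url(key) appended to the output file. Comparing normalized titles instead of taking the first hit is a design choice of this sketch: it avoids silently exporting the entry of a different paper with a similar title.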
A .csv file containing the papers' titles, which can be exported from a reference management app such as Zotero.
For example: papers_titles.csv (the file must contain a column whose header is Title)
| Title |
|---|
| An investigation of cross-project learning in online just-in-time software defect prediction |
| ARDiff: scaling program equivalence checking via iterative abstraction and refinement of common code |
| CoRA: decomposing and describing tangled code changes for reviewer |
| Boosting neural commit message generation with code semantic analysis |
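The Title column requirement can be illustrated with a short reader. This is a sketch using only the standard library, not the tool's own loading code (the dependency list suggests it may use pandas). The function name read_titles and the case-insensitive de-duplication are assumptions added for illustration.

```python
import csv
import io
from typing import List, TextIO


def read_titles(handle: TextIO, column: str = "Title") -> List[str]:
    """Read non-empty titles from a CSV stream that has a `column` header.

    Hypothetical helper: duplicates (ignoring case) are dropped so that
    each paper would be looked up only once.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError("input CSV must contain a column named " + repr(column))
    titles: List[str] = []
    seen = set()
    for row in reader:
        title = (row.get(column) or "").strip()
        if title and title.lower() not in seen:
            seen.add(title.lower())
            titles.append(title)
    return titles


if __name__ == "__main__":
    sample = io.StringIO(
        "Title,Year\n"
        "CoRA: decomposing and describing tangled code changes for reviewer,2019\n"
        "Boosting neural commit message generation with code semantic analysis,2019\n"
    )
    for title in read_titles(sample):
        print(title)
```

For a real file, the same function could be called as read_titles(open("papers_titles.csv", newline="", encoding="utf-8")).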
--input: the path of the input file containing the papers' titles, e.g. papers_titles.csv
--output: the path of the output file that will contain the papers' BibTeX entries, e.g. references.bib
--mode [optional]: the mode in which the output file is opened; default is w.
a - Append - Opens a file for appending, creates the file if it does not exist.
w - Write - Opens a file for writing, creates the file if it does not exist.
--style [optional]: the style of the BibTeX entries; default is 1.
0: standard
1: condensed
2: more condensed (deletes the string between 'DBLP:' and the 2nd '/' after it)
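The cite key shortening described for style 2 can be sketched as follows. This is an illustration rather than the tool's own code: the function names are hypothetical, and the sketch assumes that the second '/' itself is removed together with the text before it, so that DBLP:conf/icse/TabassumMFCS20 becomes DBLP:TabassumMFCS20.

```python
import re


def condense_cite_key(key: str) -> str:
    """Drop the text between 'DBLP:' and the 2nd '/' after it (slash included).

    Keys that do not start with 'DBLP:' or have fewer than two slashes
    are returned unchanged.
    """
    prefix = "DBLP:"
    if not key.startswith(prefix):
        return key
    parts = key[len(prefix):].split("/", 2)
    if len(parts) < 3:
        return key
    return prefix + parts[2]


def condense_entry(entry: str) -> str:
    """Shorten the cite key in the first line of a single BibTeX entry."""
    return re.sub(
        r"DBLP:[^,\s}]+",
        lambda match: condense_cite_key(match.group(0)),
        entry,
        count=1,
    )


if __name__ == "__main__":
    print(condense_cite_key("DBLP:conf/icse/TabassumMFCS20"))
```

Shorter keys are easier to type in \cite commands, at the cost of a small risk that two records from different venues end up with the same key.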
python titles2bibtex.py --input papers_titles.csv --output references.bib
A .bib file with uniform BibTeX entries for the papers.
For example: references.bib
@inproceedings{DBLP:conf/icse/TabassumMFCS20,
author = {Sadia Tabassum and
Leandro L. Minku and
Danyi Feng and
George G. Cabral and
Liyan Song},
title = {An investigation of cross-project learning in online just-in-time
software defect prediction},
booktitle = {{ICSE}},
pages = {554--565},
publisher = {{ACM}},
year = {2020}
}
@inproceedings{DBLP:conf/sigsoft/BadihiA0R20,
author = {Sahar Badihi and
Faridah Akinotcho and
Yi Li and
Julia Rubin},
title = {ARDiff: scaling program equivalence checking via iterative abstraction
and refinement of common code},
booktitle = {{ESEC/SIGSOFT} {FSE}},
pages = {13--24},
publisher = {{ACM}},
year = {2020}
}
@inproceedings{DBLP:conf/kbse/WangLZX19,
author = {Min Wang and
Zeqi Lin and
Yanzhen Zou and
Bing Xie},
title = {CoRA: Decomposing and Describing Tangled Code Changes for Reviewer},
booktitle = {{ASE}},
pages = {1050--1061},
publisher = {{IEEE}},
year = {2019}
}
@inproceedings{DBLP:conf/kbse/Jiang19,
author = {Shuyao Jiang},
title = {Boosting Neural Commit Message Generation with Code Semantic Analysis},
booktitle = {{ASE}},
pages = {1280--1282},
publisher = {{IEEE}},
year = {2019}
}
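When co-authoring a paper, several people may need to cite the same reference, yet the BibTeX entries they each export for it may differ, so the same work ends up being cited twice. To unify the BibTeX information, this tool relies on the dblp platform and looks papers up by title, which for the most part yields consistent BibTeX entries for computer science papers. It thereby converts a set of paper titles into a set of uniformly formatted BibTeX entries.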
A csv file with a Title column (such a file can be exported from reference management software), e.g. papers_titles.csv (one column header in the file must be Title).
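--input: the path of the input CSV file containing the set of paper titles, e.g. papers_titles.csv
--output: the output file that will contain the BibTeX entries, e.g. references.bib
--mode (optional): whether the output file is rewritten or appended to; default is w, i.e. start over.
a - Append - Opens or creates the file and appends to its existing content.
w - Rewrite - Opens or creates the file and writes it from scratch.
--style (optional): the style of the BibTeX entries; default is 1.
0: standard
1: condensed
2: more condensed (deletes the string between DBLP and the second / after it to shorten the cite key)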
pip install lxml
pip install requests
pip install beautifulsoup4
pip install pandas
python titles2bibtex.py --input papers_titles.csv --output references.bib
A bib file containing all the BibTeX entries, e.g. references.bib