idresearch Functions
Reference for the functions in idresearch.

idresearch.doc_stats(raw_abs)
Abstract Statistics
Finds the most frequently used keywords in the abstract.
Parameters
raw_abs (str) – Raw text of any article abstract.
Returns
freq_ans (dict) – Dictionary mapping each word to its number of occurrences in the abstract.

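The kind of word count that doc_stats describes can be sketched with the standard library alone. This is a minimal illustration, not the idresearch implementation: the helper name keyword_counts, the top_n parameter, and the small stop-word list are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on", "with"}


def keyword_counts(raw_abs, top_n=5):
    """Return the top_n most frequent non-stop-words with their counts."""
    # Lowercase the text and keep alphabetic tokens only.
    words = re.findall(r"[a-z]+", raw_abs.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # most_common sorts by count, highest first.
    return dict(counts.most_common(top_n))


abstract = "Graph networks learn graph structure. Networks of graph nodes scale."
freq = keyword_counts(abstract, top_n=2)
print(freq)
```

The result is a plain dictionary of word to count, which matches the shape documented for freq_ans.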
idresearch.export_reco_csv(url)
Export Recommended Articles dataframe
Exports a CSV file of all the recommended articles (url, abstract, year, publisher, citation_count).
Parameters
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Exports
new_df (csv) – CSV file containing information on all recommended articles. Filename: recolist.csv

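For readers who want to see what a file with these columns looks like, the sketch below writes and reads back a recolist.csv using only the standard library. It is not how idresearch produces the file: the helper write_reco_csv and the two placeholder rows are hypothetical, and the file is written to a temporary directory so nothing is left behind.

```python
import csv
import os
import tempfile

# Column names as listed in the function description above.
COLUMNS = ["url", "abstract", "year", "publisher", "citation_count"]

# Made-up placeholder rows, not real articles.
rows = [
    {"url": "https://example.org/paper-1", "abstract": "First placeholder abstract.",
     "year": 2020, "publisher": "Example Press", "citation_count": 12},
    {"url": "https://example.org/paper-2", "abstract": "Second placeholder abstract.",
     "year": 2021, "publisher": "Example Press", "citation_count": 3},
]


def write_reco_csv(records, directory, filename="recolist.csv"):
    """Write the records to directory/filename and return the full path."""
    path = os.path.join(directory, filename)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(records)
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = write_reco_csv(rows, tmp_dir)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        loaded = list(csv.DictReader(handle))

print(len(loaded), "rows read back")
```

Note that the csv module reads every value back as a string, so numeric columns such as year need converting before use.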
idresearch.get_doc(doc_url)
Get-response
Obtains responses from the Semantic Scholar API.
Parameters
doc_url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Returns
doc_fox (dict) – Metadata of the main queried article.
doc_paperId (str) – Semantic Scholar paperId of the main article.
reco_fox (dict) – Metadata of the recommended papers produced by Semantic Scholar AI.

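The list of supported sites matches the ones that the public Semantic Scholar Graph API accepts in its URL: paper identifier. The sketch below only builds such a lookup URL and makes no network request. Which endpoint or API version idresearch actually calls is not stated here, so the endpoint, the helper name build_lookup_url, and the chosen fields are assumptions.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public Semantic Scholar Graph API paper lookup.
API_BASE = "https://api.semanticscholar.org/graph/v1/paper/"


def build_lookup_url(doc_url, fields=("title", "abstract", "year")):
    """Build a paper-lookup URL that identifies the paper by its web URL."""
    # Keep commas unescaped so the field list stays readable.
    query = urlencode({"fields": ",".join(fields)}, safe=",")
    return f"{API_BASE}URL:{doc_url}?{query}"


lookup = build_lookup_url("https://arxiv.org/abs/1706.03762")
print(lookup)
```

Fetching this URL with any HTTP client would return JSON metadata for the paper, subject to the API's rate limits.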
idresearch.get_ner(raw_abs)
Get Named Entity Recognition
Obtains named entities of type ORG and PRODUCT from the abstract.
Parameters
raw_abs (str) – Raw text of any article abstract.
Returns
org_ent (list) – List of strings, one per ORG-type entity.
prod_ent (list) – List of strings, one per PRODUCT-type entity.

idresearch.get_reco_df(url)
Get Recommended Articles dataframe
Obtains a dataframe of all the recommended articles (url, abstract, year, publisher, citation_count).
Parameters
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Returns
new_df (dataframe) – Dataframe containing information on all recommended articles.

idresearch.main_abstract(url)
Main Paper’s Abstract
Obtains the abstract of the queried main paper.
Parameters
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Returns
main_abs (str) – Abstract of the main queried paper.

idresearch.plot_CitationCount_df(new_df)
Plot Number of Papers vs Citation Count
Plots a Number of Papers vs Citation Count histogram.
Parameters
new_df (dataframe) – Pandas dataframe returned by the get_reco_df function.
Returns
plot – A matplotlib plot.

idresearch.plot_CitationCount_url(url)
Plot Number of Papers vs Citation Count
Plots a Number of Papers vs Citation Count histogram.
Parameters
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Returns
plot – A matplotlib plot.

idresearch.plot_YearTrend_df(new_df)
Plot Number of Papers vs Publication Year
Plots a Number of Papers vs Publication Year histogram.
Parameters
new_df (dataframe) – Pandas dataframe returned by the get_reco_df function.
Returns
plot – A matplotlib plot.

idresearch.plot_YearTrend_url(url)
Plot Number of Papers vs Publication Year
Plots a Number of Papers vs Publication Year histogram.
Parameters
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Returns
plot – A matplotlib plot.

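The data behind a Number of Papers vs Publication Year histogram is a count of papers per year. The sketch below computes those counts and prints a text histogram, so it runs without matplotlib. The helper papers_per_year and the sample year values are hypothetical stand-ins for the year column of a get_reco_df result.

```python
from collections import Counter

# Hypothetical stand-in for the "year" column of the recommended-articles dataframe.
years = [2018, 2019, 2019, 2020, 2020, 2020]


def papers_per_year(year_values):
    """Count papers per publication year, ordered by year."""
    counts = Counter(year_values)
    return dict(sorted(counts.items()))


trend = papers_per_year(years)

# One row per year, one '#' per paper.
for year, count in trend.items():
    print(f"{year}: {'#' * count}")
```

The same counting applies to the citation-count plots, with citation_count values grouped into bins instead of years.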
idresearch.reco_abstract(i, url)
Recommended Paper’s Abstract
Obtains the abstract of the queried recommended paper.
Parameters
i (int) – Index number of the recommended paper, as seen in the output of the get_reco_df or export_reco_csv functions.
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
Returns
reco_abs (str) – Abstract of the queried recommended paper.

idresearch.reco_authors(url, num)
Get list of authors
Provides the list of authors and the number of times each author’s papers have been recommended, in descending order.
Parameters
url (str) – Article URL. URLs from semanticscholar, arxiv, aclweb, acm, and biorxiv are supported.
num (int) – Number of author names in the output.
Returns
occurence_common (dict) – Dictionary mapping each author name to the number of times that author’s articles have been recommended.

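Counting how often each author appears across the recommended papers, then keeping the top few in descending order, can be sketched with collections.Counter. This shows the idea only: the record layout, the helper top_authors, and the sample names are assumptions, because the structure of the recommended-paper metadata is not documented here.

```python
from collections import Counter

# Hypothetical recommended-paper records; the real metadata layout may differ.
recommended = [
    {"title": "Paper A", "authors": ["Ada Lovelace", "Alan Turing"]},
    {"title": "Paper B", "authors": ["Alan Turing"]},
    {"title": "Paper C", "authors": ["Grace Hopper", "Alan Turing", "Ada Lovelace"]},
]


def top_authors(papers, num):
    """Return the num most frequently recommended authors, highest count first."""
    counts = Counter(name for paper in papers for name in paper["authors"])
    # most_common already returns entries in descending order of count.
    return dict(counts.most_common(num))


top = top_authors(recommended, 2)
print(top)
```

Because Python dictionaries preserve insertion order, the returned dictionary keeps the descending order produced by most_common.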
idresearch.summarize_doc(raw_abs, n)
Summarize the abstract document
Summarizes the abstract by assigning a weight to each sentence (based on common words and sentence length).
Parameters
raw_abs (str) – Raw text of any article abstract.
n (int) – Number of lines for the summary.
Returns
summary (str) – Summary of the abstract in n lines, as set by the argument.

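One simple way to weight sentences by common words and sentence length is to score each sentence by the average frequency of its words across the whole abstract, then keep the n highest-scoring sentences in their original order. The sketch below does this under those assumptions. The exact weighting used by summarize_doc is not specified here, and the helper name summarize is hypothetical.

```python
import re
from collections import Counter


def summarize(raw_abs, n):
    """Return the n highest-weighted sentences, kept in their original order."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", raw_abs) if s.strip()]
    word_freq = Counter(re.findall(r"[a-z]+", raw_abs.lower()))

    def weight(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        if not words:
            return 0.0
        # Sum of word frequencies, normalised by sentence length.
        return sum(word_freq[w] for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: weight(sentences[i]), reverse=True)[:n]
    return " ".join(sentences[i] for i in sorted(ranked))


text = "Deep models need data. Data quality shapes deep models. We thank the reviewers."
print(summarize(text, 2))
```

Dividing by sentence length keeps long sentences from winning purely because they contain more words.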