Add parsing of .pl, .cz, .be and .jp domains
- Add parsing from more registry sources
- Bump Tokenizer library for bug fixes
- Update library license location
flipbit committed May 27, 2019
1 parent 3f9d614 commit 02211e0
Showing 39 changed files with 797 additions and 306 deletions.
70 changes: 70 additions & 0 deletions Whois.Tests/ParsingTests.cs
@@ -0,0 +1,70 @@
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Resources;
using NUnit.Framework;
using Tokens;
using Whois.Models;
using Whois.Visitors;

namespace Whois
{
    [TestFixture]
    public class ParsingTests
    {
        [Test]
        public void TestParseSampleDomains()
        {
            var results = new List<SampleParseResult>();

            var sampleFileNames = Directory.EnumerateFiles(@"../../../Samples/Domains", "*.txt");

            foreach (var sampleFileName in sampleFileNames)
            {
                var fullFileName = Path.Join(Directory.GetCurrentDirectory(), sampleFileName);

                var result = new SampleParseResult
                {
                    Contents = File.ReadAllText(fullFileName),
                    FullFileName = fullFileName
                };

                results.Add(result);
            }

            var visitor = new PatternExtractorVisitor();

            foreach (var result in results)
            {
                result.ContentParsed = visitor.Parse(result.Contents);
            }

            foreach (var result in results.Where(r => r.ContentParsed.Success == true).OrderBy(r => r.ContentParsed.BestMatch?.Tokens.Matches.Count))
            {
                Console.WriteLine(result.DomainName);
            }
        }

        private class SampleParseResult
        {
            public string FullFileName { get; set; }

            public string DomainName
            {
                get
                {
                    var fileName = Path.GetFileName(FullFileName);

                    return Path
                        .GetFileNameWithoutExtension(fileName)
                        .ToLowerInvariant();
                }
            }

            public string Contents { get; set; }

            public TokenMatcherResult<ParsedWhoisResponse> ContentParsed { get; set; }
        }
    }
}
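Note on the `DomainName` property above: it derives the domain from the sample file name by dropping the directory and the trailing `.txt`. A minimal standalone sketch of that mapping follows, using only `System.IO.Path`. The `SampleNames` class and `ToDomainName` helper are hypothetical names for illustration and are not part of this commit.

```csharp
using System;
using System.IO;

public static class SampleNames
{
    // Hypothetical helper mirroring the DomainName property in the test:
    // strip the directory, drop the final ".txt", and lower-case the rest.
    public static string ToDomainName(string fullFileName)
    {
        var fileName = Path.GetFileName(fullFileName);
        return Path.GetFileNameWithoutExtension(fileName).ToLowerInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(ToDomainName("Samples/Domains/08.pl.txt"));        // 08.pl
        Console.WriteLine(ToDomainName("Samples/Domains/amazon.co.jp.txt")); // amazon.co.jp
    }
}
```

Only the last extension is removed, so multi-label names such as `amazon.co.jp.txt` keep every domain label.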
12 changes: 12 additions & 0 deletions Whois.Tests/Samples/Domains/025bbs.cn.txt
@@ -0,0 +1,12 @@
Domain Name: 025bbs.cn
ROID: 20180313s10001s99456578-cn
Domain Status: ok
Registrant ID: hc1250473063700
Registrant: 南京越之彬网络科技有限公司
Registrant Contact Email: [email protected]
Sponsoring Registrar: 阿里云计算有限公司(万网)
Name Server: dns27.hichina.com
Name Server: dns28.hichina.com
Registration Time: 2018-03-13 21:45:16
Expiration Time: 2021-03-13 21:45:16
DNSSEC: unsigned
38 changes: 38 additions & 0 deletions Whois.Tests/Samples/Domains/08.pl.txt
@@ -0,0 +1,38 @@
DOMAIN NAME: 08.pl

registrant type: organization

nameservers: dns111.ovh.net.

ns111.ovh.net.

created: 2004.02.07 06:45:12

last modified: 2019.02.01 18:05:52

renewal date: 2020.02.06 13:00:00

option created: 2017.01.20 04:34:55

option expiration date: 2020.01.20 04:34:55

dnssec: Signed

DS: 46726 8 2
F2A6D8AE119F40330E2C562813320D1AC008A14B65A9D8D1DB40D4AE214977FF

REGISTRAR:

OVH SAS

2 Rue Kellermann

59100 Roubaix

Francja/France

+48.717500200

https://www.ovh.pl/abuse/

WHOIS database responses and Registrant data available at: https://dns.pl/en/whois
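Note on the `.pl` sample above: its timestamps use dots as date separators (`2004.02.07 06:45:12`), unlike the ISO-style dates in most other samples. The sketch below shows one way such a value could be parsed with `DateTime.TryParseExact`. The format string is an assumption inferred from this one sample, the `PlDates` class is a hypothetical name, and this is not how the library's templates handle the field.

```csharp
using System;
using System.Globalization;

public static class PlDates
{
    // Assumed format, inferred from the 08.pl sample: "yyyy.MM.dd HH:mm:ss".
    private const string Format = "yyyy.MM.dd HH:mm:ss";

    public static bool TryParse(string value, out DateTime result)
    {
        return DateTime.TryParseExact(
            value.Trim(),
            Format,
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out result);
    }

    public static void Main()
    {
        if (TryParse("2004.02.07 06:45:12", out var created))
        {
            Console.WriteLine(created.ToString("s")); // 2004-02-07T06:45:12
        }
    }
}
```

The sample does not state a time zone, so the parsed value is left with `DateTimeKind.Unspecified`.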
58 changes: 0 additions & 58 deletions Whois.Tests/Samples/Domains/_facebook.com.txt

This file was deleted.

29 changes: 13 additions & 16 deletions Whois.Tests/Samples/Domains/amazon.co.jp.txt
@@ -3,22 +3,19 @@
[ use 'whois -h whois.jprs.jp help'. To suppress Japanese output, add'/e' ]
[ at the end of command, e.g. 'whois -h whois.jprs.jp xxx/e'. ]

Domain Information: [ドメイン情報]
a. [ドメイン名] AMAZON.CO.JP
e. [そしきめい] あまぞん・いんく
f. [組織名] アマゾン・インク
Domain Information:
a. [Domain Name] AMAZON.CO.JP
g. [Organization] Amazon, Inc.
k. [組織種別] 外国会社
l. [Organization Type] Foreign Company
m. [登録担当者] JC076JP
n. [技術連絡担当者] IK4644JP
p. [ネームサーバ] ns1.p31.dynect.net
p. [ネームサーバ] ns2.p31.dynect.net
p. [ネームサーバ] pdns1.ultradns.net
p. [ネームサーバ] pdns6.ultradns.co.uk
s. [署名鍵]
[状態] Connected (2018/11/30)
[登録年月日] 2002/11/21
[接続年月日] 2002/11/21
[最終更新] 2017/12/01 01:02:51 (JST)
m. [Administrative Contact] JC076JP
n. [Technical Contact] IK4644JP
p. [Name Server] ns1.p31.dynect.net
p. [Name Server] ns2.p31.dynect.net
p. [Name Server] pdns1.ultradns.net
p. [Name Server] pdns6.ultradns.co.uk
s. [Signing Key]
[State] Connected (2019/11/30)
[Registered Date] 2002/11/21
[Connected Date] 2002/11/21
[Last Update] 2018/12/01 01:01:57 (JST)

28 changes: 11 additions & 17 deletions Whois.Tests/Samples/Domains/ameblo.jp.txt
@@ -3,10 +3,9 @@
[ use 'whois -h whois.jprs.jp help'. To suppress Japanese output, add'/e' ]
[ at the end of command, e.g. 'whois -h whois.jprs.jp xxx/e'. ]

Domain Information: [ドメイン情報]
Domain Information:
[Domain Name] AMEBLO.JP

[登録者名] 株式会社サイバーエージェント
[Registrant] CyberAgent, Inc.

[Name Server] a1-5.akam.net
@@ -15,25 +14,20 @@ Domain Information: [ドメイン情報]
[Name Server] a4-64.akam.net
[Name Server] a6-65.akam.net
[Name Server] a7-66.akam.net
[Signing Key]
[Signing Key]

[登録年月日] 2004/07/30
[有効期限] 2019/07/31
[状態] Active
[最終更新] 2018/08/01 01:05:09 (JST)
[Created on] 2004/07/30
[Expires on] 2019/07/31
[Status] Active
[Last Updated] 2018/08/01 01:05:09 (JST)

Contact Information: [公開連絡窓口]
[名前] 株式会社サイバーエージェント
Contact Information:
[Name] CyberAgent, Inc.
[Email] [email protected]
[Web Page]
[郵便番号] 150-0044
[住所] 東京都渋谷区
円山町 19-1
渋谷プライムプラザ2F
[Web Page]
[Postal code] 150-0044
[Postal Address] Shibuya-ku
19-1 Maruyamacho
Shibuya Prime Plaza 2F
[電話番号] 03-5459-6150
[FAX番号] 03-5784-7070

[Phone] 03-5459-6150
[Fax] 03-5784-7070
63 changes: 63 additions & 0 deletions Whois.Tests/Samples/Domains/apache.org.txt
@@ -0,0 +1,63 @@
Domain name: apache.org
Registry Domain ID: D706686-LROR
Registrar WHOIS Server: whois.namecheap.com
Registrar URL: http://www.namecheap.com
Updated Date: 2018-09-25T13:18:21.00Z
Creation Date: 1995-04-11T04:00:00.00Z
Registrar Registration Expiration Date: 2022-04-12T04:00:00.00Z
Registrar: NAMECHEAP INC
Registrar IANA ID: 1068
Registrar Abuse Contact Email: [email protected]
Registrar Abuse Contact Phone: +1.6613102107
Reseller: NAMECHEAP INC
Domain Status: clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited
Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
Registry Registrant ID:
Registrant Name: Apache DNS
Registrant Organization: The Apache Software Foundation
Registrant Street: 401 Edgewater Place Suite 600
Registrant City: Wakefield
Registrant State/Province: MA
Registrant Postal Code: 01880
Registrant Country: US
Registrant Phone: +1.1234567890
Registrant Phone Ext:
Registrant Fax: +1.7816238460
Registrant Fax Ext:
Registrant Email: [email protected]
Registry Admin ID:
Admin Name: Apache DNS
Admin Organization: The Apache Software Foundation
Admin Street: 401 Edgewater Place Suite 600
Admin City: Wakefield
Admin State/Province: MA
Admin Postal Code: 01880
Admin Country: US
Admin Phone: +1.1234567890
Admin Phone Ext:
Admin Fax: +1.7816238460
Admin Fax Ext:
Admin Email: [email protected]
Registry Tech ID:
Tech Name: Apache DNS
Tech Organization: The Apache Software Foundation
Tech Street: 401 Edgewater Place Suite 600
Tech City: Wakefield
Tech State/Province: MA
Tech Postal Code: 01880
Tech Country: US
Tech Phone: +1.1234567890
Tech Phone Ext:
Tech Fax: +1.7816238460
Tech Fax Ext:
Tech Email: [email protected]
Name Server: ns2.surfnet.nl
Name Server: ns3.no-ip.com
Name Server: ns2.no-ip.com
Name Server: ns1.no-ip.com
Name Server: ns4.no-ip.com
DNSSEC: unsigned
URL of the ICANN WHOIS Data Problem Reporting System: http://wdprs.internic.net/
>>> Last update of WHOIS database: 2019-05-25T07:08:56.94Z <<<

For more information on Whois status codes, please visit https://icann.org/epp
12 changes: 0 additions & 12 deletions Whois.Tests/Samples/Domains/bbb.org.txt

This file was deleted.

25 changes: 11 additions & 14 deletions Whois.Tests/Samples/Domains/biglobe.ne.jp.txt
@@ -3,20 +3,17 @@
[ use 'whois -h whois.jprs.jp help'. To suppress Japanese output, add'/e' ]
[ at the end of command, e.g. 'whois -h whois.jprs.jp xxx/e'. ]

Domain Information: [ドメイン情報]
a. [ドメイン名] BIGLOBE.NE.JP
b. [ねっとわーくさーびすめい] びっぐろーぶさーびす
c. [ネットワークサービス名] BIGLOBEサービス
Domain Information:
a. [Domain Name] BIGLOBE.NE.JP
d. [Network Service Name] BIGLOBE Services
k. [組織種別] ネットワークサービス
l. [Organization Type] Network Service
m. [登録担当者] HT40618JP
n. [技術連絡担当者] HT40618JP
p. [ネームサーバ] ns02.mesh.ad.jp
p. [ネームサーバ] ns03.mesh.ad.jp
s. [署名鍵]
[状態] Connected (2018/12/31)
[登録年月日] 1996/12/17
[接続年月日] 1997/01/07
[最終更新] 2018/06/06 10:56:24 (JST)
m. [Administrative Contact] HT40618JP
n. [Technical Contact] HT40618JP
p. [Name Server] ns02.mesh.ad.jp
p. [Name Server] ns03.mesh.ad.jp
s. [Signing Key]
[State] Connected (2019/12/31)
[Registered Date] 1996/12/17
[Connected Date] 1997/01/07
[Last Update] 2019/01/01 01:02:11 (JST)
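
Note on the `.jp` samples above: each field is a bracketed label followed by its value, sometimes prefixed with a letter such as `a.`. The sketch below splits such a line with a regular expression. It is an illustration only: the `JprsLines` class and the pattern are assumptions, and the library itself matches these responses with Tokenizer templates rather than this code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class JprsLines
{
    // Optional "a. " style prefix, a bracketed label, then the value.
    // The label must start with a non-space character, so banner lines
    // such as "[ at the end of command, ... ]" are not treated as fields.
    private static readonly Regex Line = new Regex(
        @"^\s*(?:[a-z]\.\s+)?\[(?<label>\S[^\]]*)\]\s*(?<value>.*)$");

    public static bool TryParse(string line, out string label, out string value)
    {
        var match = Line.Match(line);
        label = match.Success ? match.Groups["label"].Value : null;
        value = match.Success ? match.Groups["value"].Value.Trim() : null;
        return match.Success;
    }

    public static void Main()
    {
        var samples = new[]
        {
            "a. [Domain Name] BIGLOBE.NE.JP",
            "[State] Connected (2019/12/31)",
            "[ at the end of command, e.g. 'whois -h whois.jprs.jp xxx/e'. ]"
        };

        foreach (var sample in samples)
        {
            // Prints the two field lines and skips the banner line.
            if (TryParse(sample, out var label, out var value))
            {
                Console.WriteLine($"{label} = {value}");
            }
        }
    }
}
```

Fields with no value, such as `s. [Signing Key]`, still match and yield an empty string.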
