Add auxiliary datasets for the BHKSSW and SW databases of elliptic curves #5041
Comments
We also have an auxiliary database of Riemann zeros. I wonder if we should have some index of these somewhere? I was actually trying to find the Riemann zero table recently (to show someone), and it took a while even knowing that it was there. You need to go to one of the L-function pages, click on "ζ zeros" in the properties box, ask for a bunch of zeros and then you'll get a link to the actual download files. |
This data has been added to the subdirectory /bhkssw_ecdb on grace, as has the directory /stein_watkins_ecdb. Both directories need to be copied over to prodweb1 and prodweb2; can you take care of that, @edgarcosta? |
FYI, I have not yet taken care of this. Hopefully tomorrow.
|
Thanks (and no rush, there still isn't a UI for it and that will need to be tested on beta first). |
For placement, maybe we want something in the sidebar? And maybe also at the top of relevant webpages (like EC data should be displayed at the top of the EC page)? |
I like making an index page linked to from the sidebar, perhaps with the name "Datasets." Here are two possibilities for where to put it:
|
Both are good options, I think. I guess I kinda like the Universe and don't think of Datasets as Introduction, so would opt for the first? |
We already have a directory for these kinds of datasets. |
Interesting; do we just want to use the apache file browser, or make a custom index page? Some things we could do with a custom index page:
Anything I'm missing? |
Yeah, we need to add more information to it, but it would be nice to have these organized instead of being a "dropbox". |
Just noting that the page https://beta.lmfdb.org/EllipticCurve/Q/CongruentNumbers is linked to on all ECQ pages under "Learn more", and what is being proposed here for this dataset can/should be similar, even though the files are larger. |
I've put up a rough draft of an index page at https://purple.lmfdb.xyz/datasets; comments welcome. Some questions:
|
Thanks, @roed314 -- I like the index page for these datasets. Yes, the bread for each of the sub-index pages we have should go back here, let's do that in the same PR which adds this page. We also need to think about where there will be links to this index page -- somewhere in the sidebar? I don't see the file ECQ.txt. There is ec-data-S6.rank.txt in /scratch, belonging to @edgarcosta (and I don't know what that is either). The SW data can be got as optional spkgs for sage. See https://doc.sagemath.org/html/en/reference/spkg/database_stein_watkins_mini.html#spkg-database-stein-watkins-mini and https://doc.sagemath.org/html/en/reference/spkg/database_stein_watkins.html#spkg-database-stein-watkins but do not just use the data there, those packages date back to 2007 (the second one had an update in 2011). I have versions of these which would be better to use, so I will try to get these onto legendre and work out what is there, as it is some time since I did. I'll let someone else find and do similar for BHKSSW. I'm sure it is much larger and may make SW obsolete. |
Currently there's a link at the bottom of the sidebar, based on my discussion with @jvoight earlier on this issue. I'll update the bread on the sub-index pages. I wasn't sure what the ECQ.txt file was (it's currently visible here), so I haven't added it to the index yet. Maybe it is the BHKSSW database? I looked at the end of the file and the last conductor size didn't seem right for that, but maybe they're not ordered that way. |
|
I should also point out there is other data that all the servers have access to:
|
Thanks, I just found that folder too. In particular, it includes bhkssw and stein_watkins, so that answers my question from earlier. |
Do we want to make any of the other data in that folder visible on the datasets index page?
I'm inclined to not add any of these for now, but if someone feels inspired feel free to work on one of them! |
I've pushed an initial index page for the BHKSSW dataset, but
@AndrewVSutherland The Because of the gunicorn timeout, to get this actually working we'll need to either address #6221, find a workaround for static files, or split the BHKSSW files into smaller pieces (which will probably require reworking the download table at the bottom of the page). |
I think we should split the files (I originally received them that way, and I converted them from sqlite to plain text). |
BHKSSW won't make SW obsolete, it is organized by height while SW is organized by conductor, and the datasets are largely disjoint. There is a future dataset that may make SW obsolete (I'm working on constructing a database of elliptic curves of bounded discriminant that goes well past the SW bounds, the file ECQ.txt was a preliminary version), but it is not ready to be published yet and I think there is value to having SW available for historical reasons regardless. |
This is very old so let me try to remember:
|
@tornaria I don't have an easy way to give you access to the folder (it's 34GB), but the directory structure is
|
Alright, the new index page is in decent shape, and there are subpages for Stein-Watkins and BHKSSW. I need to wait a couple days for @AndrewVSutherland to get access to the column order for Stein-Watkins (and will probably need to break up the BHKSSW files into smaller pieces, perhaps 2700 files of size 10^7), but in the meantime let me know if you have feedback. Some additional questions:
|
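For reference, the "2700 files of size 10^7" figure above can be checked with a short sketch. It assumes that "size 10^7" means each file covers a window of naive height of width 10^7, which is consistent with the stated bound of 2.7*10^10 on the height; the function name and the half-open interval convention are illustrative choices, not taken from the LMFDB code.

```python
# Assumption: each file covers a half-open window [lo, hi) of naive height.
MAX_HEIGHT = 27 * 10**9  # 2.7 * 10^10, the height bound of the BHKSSW database
CHUNK = 10**7            # assumed width of the height window per file


def height_ranges(max_height=MAX_HEIGHT, chunk=CHUNK):
    """Yield (lo, hi) pairs covering [0, max_height) in windows of width chunk."""
    for lo in range(0, max_height, chunk):
        yield lo, min(lo + chunk, max_height)


ranges = list(height_ranges())
print(len(ranges))  # number of files under these assumptions
```

If the files are instead split by number of curves rather than by height window, the count would differ, since the database has 238,764,310 curves in total.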
I'd assume that we should use the same license as the rest of the LMFDB as in #5088, unless the data asks for a stronger license? With so few datasets, it doesn't seem to matter too much how they are sorted. I might sort by date submitted, but wait, do we have that information? |
I think using the same license makes a lot of sense. I asked about it because the page includes a section soliciting datasets; we may want to be explicit there that if someone suggests adding a dataset to the LMFDB then they should be okay licensing it as CC-BY-SA (for example). I don't know that we have information on date submitted, though with only 5 it's probably possible to find that information in emails. |
The Rathbun data on congruent numbers was first given to me in 2015. Is that the date we want? It was rather later that I got around to putting it into LMFDB with an index page etc. |
Can we have some sort of checklist on this issue so that we know when we can close it? |
@roed314 nice job on the pages for databases, and particularly happy to see the BHKSSW database easily accessible (so much easier to deal with .txt files than sqlite files!) On that note though, is there a way to download a range of txt files? Thanks! Also, perhaps we can convert each txt file (or combine them) to a Magma readable format? I did that once back in the day when I played with the database for a paper, so I could try doing that again if that would be useful. |
I split up the files into smaller pieces because of our limitations on file size. So the way to download a range of files is to use a loop on the client side. If we can figure out how to remove our file size limitation (which is difficult), we could do various things to make accessing this data more convenient. As for magma downloads, it should be feasible to construct those on the fly from the text data (rather than storing duplicate data files on the server). |
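A minimal sketch of the client-side loop mentioned above, in Python. The base URL and the file naming scheme here are placeholders (hypothetical, not the actual LMFDB paths); substitute the links shown in the download table on the dataset page.

```python
import urllib.request
from pathlib import Path

# Hypothetical placeholder: replace with the real base URL from the dataset page.
BASE_URL = "https://example.org/datasets/bhkssw"


def file_name(i):
    """Hypothetical naming scheme for the i-th piece of the dataset."""
    return f"bhkssw_{i:04d}.txt"


def file_urls(start, stop, base_url=BASE_URL):
    """Return the URLs for pieces start, start+1, ..., stop-1."""
    return [f"{base_url}/{file_name(i)}" for i in range(start, stop)]


def download_range(start, stop, dest=".", fetch=urllib.request.urlretrieve):
    """Download pieces start..stop-1 into dest, skipping files already present."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for url in file_urls(start, stop):
        target = dest / url.rsplit("/", 1)[-1]
        if not target.exists():  # lets an interrupted run be resumed
            fetch(url, str(target))
        paths.append(target)
    return paths
```

With the real URL pattern filled in, something like `download_range(0, 10, dest="bhkssw")` would fetch the first ten pieces one at a time. Each request is small, so it stays under the per-file size limitation.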
As for @JohnCremona's suggestion of a checklist, I'm not sure what remains to be done on this issue. Maybe we should close it and @alozanoroble can open a new one with suggested improvements. |
OK, let's close it. |
The authors of Databases of elliptic curves ordered by height and distributions of Selmer groups and ranks have kindly given us permission to make their database of 238,764,310 elliptic curves over Q of naive height up to 2.7*10^10 available as an auxiliary dataset that can be downloaded from the LMFDB, similar to what we do for the database of class groups of imaginary quadratic fields.
EDIT: The authors of A database of elliptic curves -- A first report have also given us permission to host their database of approximately 150 million elliptic curves with absolute discriminant at most 10^12 and conductor at most 10^8 or prime conductor at most 10^12. Given the reasonably modest sizes of these files (gigabytes not terabytes), we can just put them in the /data directory rather than in separate storage buckets.