Problems cloning repository submodules #1180
Comments
Looks like github is having a bad moment (again). When your connection is flaky or github has server issues, cloning submodules tends to fail with the above message. There is nothing we can fix, and the only "solution" is to retry fetching submodules until it works. Our repo is rather prone to this as we use lots of submodules. For submodules that have only been fetched partially, afaik you can cd into that folder and fetch manually there? |
Is there some way to "try fetching submodules" short of cleaning the entire tree and starting from scratch? I can't spend 20-30 minutes on every failure. 'git submodule update' does nothing in the failed submodules, as best I can tell - nor does it complain. The submodule structure is messed up after the clone, but git does not seem to be aware that it's messed up, which is heinous. |
I see you edited that while I was typing. Will attempt that but really, git? |
git (and github) being so flaky is the reason I personally only use submodules where I have to. And our samples repo uses so many submodules that it's very prone to failure. I run into the exact same problem you report several times a week and it's always frustrating. |
If you only need to clone for building the documentation, we should find a way to make the Asciidoc part build without having to clone the whole repo. @gpx1000 any ideas on how to achieve that? Building docs doesn't require any submodule, so if they could be skipped things would probably be a lot easier for Jon. |
That would be great as it drops the repo cloning time from substantial fractions of an hour to seconds (and frees up 3 GB on my SSD). However, doing the 'cmake -H"." -B"build/unix" -DVKB_GENERATE_ANTORA_SITE=ON' then throws a lot of errors regarding missing CMakeLists files and targets and such, so it sounds like some substantial work on the cmake configuration would be needed. |
Agreed re submodules, they are the spawn of Satan. |
In other projects we did split the compiler project and documentation setup/build processes. I think having a separate cmake/make file or even something simpler for the samples repo is a workable solution. |
Have you tried 'git submodule update --init --recursive'? Submodules are a part of git and are quite mature and stable when used correctly. |
I'll come up with a method of building docs without needing any further projects downloaded but this one... Gimme time to think it through. |
Maybe something similar to what I did with the tutorial: https://github.com/KhronosGroup/Vulkan-Tutorial/tree/main/antora? A separate documentation makefile with some python script to do the heavy lifting (which might not be required for this repo). |
Well if I can, I'd rather prefer to use CMake; and not pollute the project with other build systems. However, I do like what you did with Vulkan-Tutorial. |
I am just following the recipe in the README. Git isn't responsible for network issues (arguably, at least), but I don't think it's on the user that after failing to download the submodules and throwing a bunch of error messages, git leaves the failed submodules in a broken state with no advice as to how to fix them. Since this appears to be happening with some regularity to one of the repository admins as well, adding further advice to the README about how to recover from it seems useful. |
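As a rough sketch of the "try fetching submodules until it works" advice above, a small retry wrapper can rerun the submodule update instead of re-cloning the whole tree. The `retry` helper, the attempt count, and the delay are illustrative assumptions, not something from the README. The thread does not settle whether a rerun repairs a half-cloned submodule, so a 'git submodule deinit -f' on the broken path may still be needed first.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: hypothetical helper that reruns CMD until it
# succeeds or MAX attempts have been used, sleeping DELAY seconds in between.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage, from inside the Vulkan-Samples clone (values are illustrative):
#   retry 5 30 git submodule update --init --recursive
```

The wrapper only checks the exit status of the command, so it helps when git reports the failure, and does nothing for the case where 'git submodule update' exits cleanly but leaves a submodule empty.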
git config --global http.lowSpeedLimit 0 # Disable low speed limit
I think this might stem from lower bandwidth. The above might help. Please report back if the issue persists. Once we know what the solution is, we can update the README as appropriate. |
Currently I'm getting clone speeds of 30-50 KiB/s from github - and 25 MiB/s from Khronos gitlab. Curious if you're seeing anything like that. Maybe github is just extremely congested. It does not make a difference whether I'm cloning Vulkan-Samples or another github repo, or whether there are submodules or not: same dismal download performance. |
The problem with testing is that it takes ca. half an hour of this sluggish performance before the fatal errors occur. If I crank up the lowSpeedTime timeout then it's very possible I would be waiting lowSpeedTime seconds, or until github rebooted their servers, whichever comes first. @outofcontrol are you seeing this kind of github speed throttling going on at your end? E.g. 'git clone --recurse-submodules git@github.com:KhronosGroup/Vulkan-Samples.git' getting delivered performance in the range of 30-50 KiB/s when it should be hundreds of times that. |
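For reference, the two settings discussed here work as a pair: git aborts an HTTP(S) transfer when throughput stays below http.lowSpeedLimit bytes per second for longer than http.lowSpeedTime seconds. A minimal sketch of setting and later removing both follows. The helper names and the example values are assumptions for illustration. These http.* settings are not consulted for SSH URLs of the git@github.com: form.

```shell
#!/bin/sh
# tolerate_slow_link LIMIT TIME: hypothetical helper. Abort a transfer only if
# it stays under LIMIT bytes/s for more than TIME seconds (HTTP/HTTPS only).
tolerate_slow_link() {
    git config --global http.lowSpeedLimit "$1"
    git config --global http.lowSpeedTime "$2"
}

# reset_slow_link: remove both settings again and fall back to git's defaults.
reset_slow_link() {
    git config --global --unset http.lowSpeedLimit
    git config --global --unset http.lowSpeedTime
}

# Usage (illustrative values: under 1000 bytes/s for 10 minutes):
#   tolerate_slow_link 1000 600
#   git submodule update --init --recursive
#   reset_slow_link
```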
I'm getting between 12 and 18 MiB/s. Running 'time git clone --recurse-submodules git@github.com:KhronosGroup/Vulkan-Samples.git' yields the following results:
I do see people reporting that the GitHub CLI is faster for them, and others posit that this is caused by negotiating the security and that using a personal access token might improve your speed. |
Running locally with:
Total size is 3.4G, which is ~40 MiB/s if I am not mistaken? |
I'm not sure what the "GitHub CLI" is? I'm using the command-line git client. Having a hard time imagining what sort of authentication might cause a 500x slowdown. |
I tried using 'gh clone' and am getting (edit: far worse behavior) than plain 'git'. It took 30 minutes to complete and every single submodule failed to download, though the main repo did. 'gh auth status' says
Possibly the behavior is related to the ISP's network configuration. I tried switching from AT&T's nameserver to the Google 8.8.8.8 but continue to see some of the submodules download at glacial speeds and fail (either way, most download at very reasonable 150-400 Mbps rates - it is only a couple of submodules in each attempt, and not the same ones each time, either). I have seen suggestions of going through a VPN to avoid ISP routing issues and I might try Sonic's self-hosted VPN service, not having a commercial VPN. Not a solution that would be generally useful if it were to work. |
It is possible that a VPN showing you're coming from another country will do the trick? Obviously this isn't something that we can correct with a README or documentation. However, maybe? |
Is it possible there are some transient issues in a router somewhere, which might be discoverable with |
I don't know what to make of it but there is certainly a lot of packet loss reported at sites in the middle by mtr - though not at the destination. OTOH if I run mtr against gitlab.com, where I have had zero problems recently, the mtr results look very similar, several sites in the middle with '???' hostname and 100% packet loss reported. If you can suggest a line of approach to get this past an L1 AT&T CSR / "AI" chatbot to someone who might actually be able to do something with their network, I'm all ears. |
Possibly slightly related, for some months now I've been having periodic problems loading github.com PR / Issue pages in Chromium and it is going on with a vengeance at the moment. Reloading in the tab does nothing, opening a new tab with the same URL sometimes works after 3 or 4 tabs. Firefox has no problems at all. It sort of sounds like the same problem with packet loss corrupting sessions although Chromium reports no errors, just sits there and spins forever. |
Might be worth pinging the OrgTechEmail when doing a whois on 192.205.32.182? Amazingly enough, I've had replies from doing a similar outreach in past years. A VPN might reroute you around the problem servers? |
Is OrgTechEmail something from the DNS record? Not familiar with the term. |
In the WhoIs look up, there are several fields for contact, OrgTechEmail is the Organization Technical Email contact for that 192.205.0.0/16. |
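A small sketch of pulling that contact out of a whois reply, assuming the 'Field: value' layout described above. The `org_tech_email` helper is hypothetical, and the exact field names depend on which registry answers the query.

```shell
#!/bin/sh
# org_tech_email: hypothetical helper. Reads whois output on stdin and prints
# the value of every OrgTechEmail field (matched case-insensitively).
org_tech_email() {
    awk -F': *' 'tolower($1) == "orgtechemail" { print $2 }'
}

# Usage (needs network access plus the whois and mtr clients):
#   whois 192.205.32.182 | org_tech_email
#   mtr --report --report-cycles 20 github.com
```

The mtr line produces a one-shot text report rather than the interactive display, which is easier to paste into a support ticket.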
They never replied, and attempts to pursue this through AT&T's "customer service" work about as well as you might expect (including initial denials that AT&T had either an internal network, or network engineers responsible for maintaining it). Now they want to replace my modem :-( Running through Sonic's OpenVPN service, downloads are robust - they run at about 1/8th the performance I see from github w/o VPN (when it's not stalled), but at least that's better than running at 1/1000th that performance when it is stalled, and the VPN setup is not failing due to timeouts. Supposedly WireGuard-based VPNs come a lot closer to the connection capacity but that's not an option here. I think the takeaway here is that AT&T has a problem they are unwilling to diagnose - so my only options seem to be to run a higher performance VPN, or switch ISPs. Fortunately I think Sonic may have finally rolled fiber in my neighborhood, as I've been waiting for since 2018, so that may be the right option. It seems advisable to make a note of this in the README since at least @SaschaWillems appears to be suffering similar problems - I would be curious if your using a VPN also improves the situation when you're suffering this behavior. |
due to ISP-related networking issues and a possible workaround using a VPN service. Closes #1180
git clone git@github.com:KhronosGroup/Vulkan-Samples.git
cd Vulkan-Samples
perl -i -p -e 's|https://(.*?)/|git@\1:|g' .gitmodules
git submodule sync
git submodule update
I was able to get it working by using ssh instead. I had trouble with CTPL and fixed it with:
git submodule deinit -f third_party/CTPL
git submodule update --init |
@SupinePandora43 thanks! This does seem to help, for me, but there are issues with ssh protocol that may make it hard to just drop into the repository as the default behavior - see #1206. If you know more about the tradeoffs and could comment, that would be welcome. |
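One way to try ssh without editing .gitmodules or changing the repository default is git's url.<base>.insteadOf rewrite, set in the user's own configuration. This is only a sketch: it assumes the submodules in question are hosted under https://github.com/ and that an SSH key is registered with GitHub, and the helper names are made up for illustration.

```shell
#!/bin/sh
# use_ssh_for_github: hypothetical helper. Makes git rewrite
# https://github.com/... URLs to git@github.com:... for this user only.
use_ssh_for_github() {
    git config --global url."git@github.com:".insteadOf "https://github.com/"
}

# undo_ssh_for_github: remove the rewrite again.
undo_ssh_for_github() {
    git config --global --unset url."git@github.com:".insteadOf
}

# Usage:
#   use_ssh_for_github
#   git ls-remote --get-url https://github.com/KhronosGroup/Vulkan-Samples.git
#   git clone --recurse-submodules https://github.com/KhronosGroup/Vulkan-Samples.git
```

The 'git ls-remote --get-url' call prints the URL git would actually use without contacting the server, so it is a quick way to confirm the rewrite is active. Because the setting lives outside the repository, anyone who does not opt in keeps cloning over HTTPS.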
Update: the issue is to have the full repository clone, with submodules, work without submodule fatal errors. @gpx1000 volunteered to look at this, although they are not running into this on their own home network which may make reproduction difficult. However, both @SaschaWillems and myself are running into it frequently. Desired outcome is first, change the repository so this does not happen; or if that's not possible, give better advice in the README as to how to recover when it does happen.
I spun off the issue of how to build just the Vulkan-Samples component of docs.vulkan.org as #1181 as that's orthogonal to the submodule cloning problem.
When I try
as described in the README, I'm getting errors of the following form in the submodule cloning:
I'll grant this could just be something about my ISP (AT&T Fiber, haven't had any issues in the last 10 months that weren't just temporary outages). But this operation feels very fragile - are there options that can help? If I then go into the clone and
Then this does not complain. But the submodules that failed seem to just contain a .git directory and nothing else afterwards. Then
gets a small distance into the build and starts failing with
Here the CTPL directory just contains .git and nothing I've tried with submodule changes that situation. I can pull the underlying submodule's repository OK independently of the samples repo submodule setup - but not convince git that the submodule itself needs to be updated / replaced.
It would be really helpful if the README described these sorts of scenarios and advised how to work around them as this is far outside my minimal knowledge of submodules. I've tried removing Vulkan-Samples and re-cloning - the same sorts of failures (but not the same specific submodules) keep happening. It takes 20-30 minutes for a complete cycle with all the retries to complete and it's not easy to test.