Dual-core issues #1139

Closed
stnolting opened this issue Jan 3, 2025 · 4 comments
Labels
enhancement New feature or request help wanted Extra attention is needed HW Hardware-related SW Software-related

Comments

@stnolting
Owner

While playing with the all-new dual-core configuration, I stumbled upon some issues that should be fixed:

  • Inter-core communication using the shared main memory is incredibly slow because the caches require flushing and reloading. Even worse, you always have to keep cache coherency in mind.
  • Implementing spinlocks (or more complex mechanisms like semaphores and mutexes) is tricky. We have atomic load-reserved and store-conditional operations (Zalrsc ISA extension). However, only a single reservation set is supported. 🤔 As far as I understand, this means there can only be a single variable with guaranteed atomic access. This makes implementing atomic memory accesses (for synchronization) hard.
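
For context, a load-reserved/store-conditional pair is normally used inside a retry loop. The following is a minimal sketch in portable C11, not code from the project: it assumes a toolchain that lowers a weak compare-and-swap to an LR/SC sequence on this target, and the names `shared_counter` and `counter_fetch_add` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

// Hypothetical shared variable, for illustration only.
static _Atomic uint32_t shared_counter = 0;

// Atomically add `delta` and return the previous value.
// The weak compare-and-swap is allowed to fail spuriously (for example
// when a reservation is lost), so it is always wrapped in a retry loop.
uint32_t counter_fetch_add(uint32_t delta) {
    uint32_t old = atomic_load_explicit(&shared_counter, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               &shared_counter, &old, old + delta,
               memory_order_acq_rel, memory_order_relaxed)) {
        // On failure `old` now holds the current value; try again.
    }
    return old;
}
```

Every update goes through the same reserve, modify, conditionally store sequence, which is why the number of reservation sets the hardware tracks matters for this style of synchronization.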

Some Ideas

  • Inter-core communication: We could add some queue-like FIFOs for passing messages between the cores. These FIFOs could be located in the (uncached) IO address space, which would be easy to implement. Alternatively, those queues could be mapped to CPU-internal control and status registers. This is harder to implement but would provide minimal latency.
  • Atomic accesses: I don't like the load-reserved/store-conditional concept anymore. 😅 It appears to be quite useless if there is only a single reservation set. We could add more of them, but that would make the hardware really complex (potentially increasing the critical path of the bus system). So maybe we should drop that and go for the full AMO ISA extension? Those read-modify-write operations would require even more complex logic right inside the bus system, which might also degrade clock performance… But it would be RISC-V-compatible and we could use all the nice GCC building… Another alternative would be adding some kind of hardware spinlocks like in the RP2040. 🤔
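
To make the queue idea more concrete, here is a rough software model of the semantics such a FIFO could offer: a non-blocking put and get with full and empty status, one producer core and one consumer core. It is only a sketch under assumptions. The depth, the type `msg_queue_t` and the function names are hypothetical, and a hardware FIFO in IO space or behind CSRs would replace the array and indices entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_DEPTH 4u  // hypothetical depth; must be a power of two
static_assert((QUEUE_DEPTH & (QUEUE_DEPTH - 1u)) == 0u,
              "QUEUE_DEPTH must be a power of two");

// Single-producer / single-consumer message queue (software model).
typedef struct {
    uint32_t data[QUEUE_DEPTH];
    _Atomic uint32_t head;  // next slot to read; written only by the consumer
    _Atomic uint32_t tail;  // next slot to write; written only by the producer
} msg_queue_t;

// Producer side: returns false (and stores nothing) if the queue is full.
bool msg_queue_put(msg_queue_t *q, uint32_t msg) {
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_DEPTH) {
        return false;  // full
    }
    q->data[tail % QUEUE_DEPTH] = msg;
    // Release: the payload becomes visible before the new tail does.
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}

// Consumer side: returns false (and leaves *msg untouched) if the queue is empty.
bool msg_queue_get(msg_queue_t *q, uint32_t *msg) {
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail) {
        return false;  // empty
    }
    *msg = q->data[head % QUEUE_DEPTH];
    // Release: the slot is handed back to the producer after it was read.
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}
```

Kept in cached main memory, this model would still suffer from the flushing cost described above, which is the argument for placing the queue in uncached IO space or in CSRs.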
@stnolting stnolting added enhancement New feature or request help wanted Extra attention is needed HW Hardware-related SW Software-related labels Jan 3, 2025
@stnolting
Owner Author

Those annoying reservation-set operations are about to be replaced by "actual" atomic read-modify-write memory operations in #1141.
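
As an illustration of what read-modify-write operations enable, a spinlock can be built on a single atomic exchange. This is a minimal sketch in C11, not code from #1141. The names `spinlock_t`, `spinlock_try_acquire`, `spinlock_acquire` and `spinlock_release` are hypothetical, and it is an assumption about the toolchain that the 32-bit exchange compiles to one atomic swap instruction.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical lock type: 0 = free, 1 = taken.
typedef struct {
    _Atomic uint32_t locked;
} spinlock_t;

// Try to take the lock once; returns true on success.
// The exchange writes 1 and returns the old value in one atomic step,
// so exactly one contender can observe the 0 -> 1 transition.
bool spinlock_try_acquire(spinlock_t *l) {
    return atomic_exchange_explicit(&l->locked, 1u, memory_order_acquire) == 0u;
}

// Spin until the lock is taken.
void spinlock_acquire(spinlock_t *l) {
    while (!spinlock_try_acquire(l)) {
        // busy-wait
    }
}

// Release the lock; the release ordering publishes the critical section.
void spinlock_release(spinlock_t *l) {
    atomic_store_explicit(&l->locked, 0u, memory_order_release);
}
```

Each lock is just its own memory word, so any number of independent locks can coexist.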

However, I am still not sure what the best solution for inter-CPU communication would be. A global uncached scratchpad RAM? Some global uncached FIFO queues? Or a direct queue connection between the two cores? Any ideas? 🤔

@stnolting
Owner Author

#1142 adds inter-core communication via CSR-mapped message queues (FIFOs).

@NikLeberg
Collaborator

Things move fast in the neorv32 world 😄!
Do you have a guess at how much the latest changes improved the system's performance? I have not had a chance to run it myself yet.

@stnolting
Owner Author

> Things move fast in the neorv32 world 😄!

😄

> Do you have a guess about the improved performance the latest changes gave to the system?

Unfortunately not, but the dual-core configuration is faster than I expected. However, I'm still not quite sure how best to benchmark the performance.

2 participants