The most straightforward way is to use a 2-port SRAM as the memory storage, in which each port can operate on its own clock.
Another way is to take advantage of an SRAM-based synchronous FIFO. See diagram below.
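We use a small flop-based async FIFO to do the clock domain crossing, and the majority of the memory storage is implemented in an SRAM-based sync FIFO. The two FIFOs use a valid-ready protocol for data transfer: an entry moves from the async FIFO to the sync FIFO only in a cycle where the async FIFO has data (valid) and the sync FIFO has space (ready).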
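As a rough illustration of the valid-ready transfer between the two FIFOs, here is a minimal behavioral sketch in Python. It is not RTL, and it models only the rd_clk side, where both FIFOs share a clock. The function name `step` and the capacity value are hypothetical, chosen only for this example.

```python
from collections import deque


def step(src: deque, dst: deque, dst_capacity: int) -> bool:
    """Model one rd_clk cycle of a valid-ready transfer from src to dst.

    Returns True if an entry moved this cycle.
    """
    valid = len(src) > 0              # source (async FIFO) has data to offer
    ready = len(dst) < dst_capacity   # destination (sync FIFO) has space
    if valid and ready:
        dst.append(src.popleft())     # transfer happens only when both are high
        return True
    return False


# Hypothetical example: 3 entries waiting, destination holds only 2.
async_fifo = deque([1, 2, 3])
sync_fifo = deque()
transfers = [step(async_fifo, sync_fifo, dst_capacity=2) for _ in range(4)]
```

In this run the first two cycles transfer data and the remaining cycles stall, because the destination is full and deasserts ready.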
There’s a sizing requirement for the small async FIFO shown above. Assuming the rd_clk domain can pop data from the async FIFO every rd_clk cycle whenever data is available, the goal is for the wr_clk domain to be able to push data into the async FIFO every wr_clk cycle.
Remember, there is a “round-trip latency”: wr_ptr increments in the wr_clk domain, the new wr_ptr is synchronized to the rd_clk domain, data is popped from the FIFO, rd_ptr increments in the rd_clk domain, the new rd_ptr is synchronized back to the wr_clk domain, and finally wr_clk sees an entry available. The async FIFO must be deep enough to cover this round-trip latency. Otherwise the wr_clk domain sees “bubbles”, cannot push data every wr_clk cycle, and loses bandwidth. The depth of the async FIFO must therefore be >= (“round-trip latency” / wr_clk period), rounded up to a whole number of entries.
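The depth rule can be sketched as a small calculation. This is only an estimate under stated assumptions, not a definitive sizing formula: it assumes N-flop pointer synchronizers in each direction (`sync_stages`, default 2), one extra wr_clk cycle to register wr_ptr, and one extra rd_clk cycle to pop data and update rd_ptr. The function name and all parameters are hypothetical. The real latency depends on the specific FIFO implementation.

```python
def async_fifo_min_depth(wr_period_ps: int, rd_period_ps: int,
                         sync_stages: int = 2,
                         extra_wr_cycles: int = 1,
                         extra_rd_cycles: int = 1) -> int:
    """Estimate the minimum async FIFO depth that covers the round trip.

    Assumed latency model (an illustration, not taken from a real design):
      - extra_wr_cycles wr_clk cycles to register the new wr_ptr
      - sync_stages rd_clk cycles to synchronize wr_ptr into rd_clk
      - extra_rd_cycles rd_clk cycles to pop data and update rd_ptr
      - sync_stages wr_clk cycles to synchronize rd_ptr back into wr_clk
    Periods are integers in picoseconds to avoid floating-point rounding.
    """
    wr_side_ps = (extra_wr_cycles + sync_stages) * wr_period_ps
    rd_side_ps = (extra_rd_cycles + sync_stages) * rd_period_ps
    round_trip_ps = wr_side_ps + rd_side_ps
    # depth >= round-trip latency / wr_clk period, rounded up (integer ceiling)
    return -(-round_trip_ps // wr_period_ps)


# Same frequency on both sides (1 GHz each)
depth_same = async_fifo_min_depth(1000, 1000)
# Read side twice as fast as the write side
depth_fast_rd = async_fifo_min_depth(2000, 1000)
```

Under these assumptions `depth_same` is 6 and `depth_fast_rd` is 5. In practice the result is often rounded up to a power of two, which keeps Gray-coded pointers simple.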