In my program, sieve.cpp, I tried char*, std::bitset, and std::vector in turn as the container for the cross-off list and ran each one.
It did not take long before the char* was dumped, as it's very profligate with memory. So I tried std::bitset, which choked at pow(2,32), so I was annoyed as usual.
So finally I tried std::vector<bool>, which was the only one of the three that could handle a cross-off list larger than pow(2,32).
So the sieve's real problem is RAM consumption, and a single bit per candidate is as good as it gets: pow(2,32) candidates take 512 MiB as bits versus 4 GiB as chars.
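For illustration, here is a minimal sketch of a sieve with a std::vector<bool> cross-off list. It is not the code from sieve.cpp: the function name count_primes and the choice to count primes rather than print them are assumptions made only for this example. The indexes are 64-bit throughout, so the limit can go past pow(2,32) as long as the machine has the RAM.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from sieve.cpp): counts the primes <= limit
// using a bit-packed cross-off list.
std::uint64_t count_primes(std::uint64_t limit) {
    if (limit < 2) return 0;

    // One entry per candidate; true means "crossed off" (composite).
    // std::vector<bool> typically packs these into single bits.
    std::vector<bool> crossed(limit + 1, false);

    std::uint64_t count = 0;
    for (std::uint64_t i = 2; i <= limit; ++i) {
        if (crossed[i]) continue;  // already crossed off by a smaller prime
        ++count;                   // i is prime

        // Cross off multiples starting at i*i. The division form of the
        // check avoids overflowing i*i for large i.
        if (i <= limit / i) {
            for (std::uint64_t j = i * i; j <= limit; j += i) {
                crossed[j] = true;
            }
        }
    }
    return count;
}
```

Calling count_primes(100) returns 25, and count_primes(1000000) returns 78498.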
Moving to a parallel version improved the performance, which suggests the CPU cache was helping.
std::array is no better, as it has no bit-packed specialization for bool the way std::vector does. I was also very disappointed by the limitations of std::bitset: its size and indexes are declared as size_t, yet it still gave out at pow(2,32) for me.
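The difference in footprint can be checked at compile time. This is a small sketch with a made-up size (kFlags is a hypothetical name and value chosen for the example). It assumes a mainstream implementation where bool takes one byte and std::bitset packs its bits, which the standard does not strictly promise.

```cpp
#include <array>
#include <bitset>
#include <cstddef>

// Hypothetical flag count, kept small so it compiles anywhere.
constexpr std::size_t kFlags = 1024;

// std::array<bool, N> stores one whole bool object per flag.
constexpr std::size_t array_bytes = sizeof(std::array<bool, kFlags>);

// std::bitset<N> takes its size as a std::size_t template parameter
// and, on common implementations, stores one bit per flag.
constexpr std::size_t bitset_bytes = sizeof(std::bitset<kFlags>);

static_assert(array_bytes >= kFlags,
              "std::array<bool, N> needs at least one byte per flag");
static_assert(bitset_bytes < array_bytes,
              "assumption: this implementation packs bitset bits");
```

Note that both of these types need their size fixed at compile time and keep their storage inside the object itself, while std::vector<bool> takes its size at run time and allocates on the heap.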
Any programs offered on my site are 64-bit only. 32-bit need not apply.