In the following I use "thread" mostly to refer to the software thing(wp), and "core" (virtual or physical) for the hardware thing(wp). You could say that cores (virtual or physical) are a way to implement threads, and kernel multithreading is another way to implement threads, at a different layer. I will look at layers of abstractions starting from your program (the most virtual) to the CPU cores (the most physical).
- Your program sees that it can request any number of threads for itself, so when it has something to do that feels like a separate "line of reasoning", it can start a thread for that.
- The entity showing effectively unlimited threads to the program is the kernel's scheduler, part of the operating system. The scheduler itself sees as many cores as the CPU exposes, including the virtual ones, and will generally try to keep each core busy with at least one thread (if its current policy aims to maximize performance). But when there are more threads than cores, it has to multiplex(wp) several program threads onto the same CPU core, which it does by running one thread for a short amount of time, then another thread for a short amount of time, and so on, all on the same core.
- The cores that the CPU offers can partly be physically separate cores on the same CPU (each being almost a CPU of its own, sharing just some parts with the rest), but partly they can also be virtualized (under trademarked names like Hyper-threading(wp)). This is done when an individual physical core knows that it could be juggling multiple CPU instructions at the same time(wp), but the bit of thread logic it's currently executing doesn't lend itself to that; to keep itself busy anyway, it can juggle instructions from a different thread at the same time.
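To make the program-level view concrete, here is a minimal Python sketch of the first layer: the program just asks for threads whenever it has independent work, without knowing (or caring) how many cores exist underneath. The worker function and the thread count here are arbitrary illustrations, not anything dictated by the layers below:

```python
import threading

results = []
lock = threading.Lock()

def worker(n):
    # Each thread is a separate "line of reasoning"; here it just
    # computes a square and records it under a lock.
    with lock:
        results.append(n * n)

# The program can request as many threads as it likes; the kernel's
# scheduler decides how to map them onto actual cores.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 1, 4, 9]
```

If you started 400 threads instead of 4 on a 4-core machine, the program would look exactly the same; only the scheduler's multiplexing work would change.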
Now going backwards:
- Since physical cores are the same thing (in the sense that they are used in the same way) as virtual cores from the kernel's point of view, yet their performance profile is quite different, it can be important for the CPU to inform the kernel about these differences, so that the kernel's scheduler can react by adopting a policy that works best for the task at hand given this additional knowledge about the CPU's structure.
- Similarly, even though your program "sees" an unlimited number of threads at its disposal, using a ton of them "just because it can" may not be the best idea: when many threads run on the same physical processor core, more and more overhead goes into switching(wp) among them. So it can be important for the kernel to tell the program what the situation actually looks like physically, so that the program can decide how to use threads, and also give hints back to the kernel on how to plan its thread-to-core mapping, for example by setting affinity(wp).
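This information flow in the physical-to-virtual direction can be sketched in Python: the program asks the kernel how many cores actually exist, and (on Linux) can hint its preferred thread-to-core mapping back via the affinity interface. The single-core pinning below is just an illustration of the mechanism, not a recommendation:

```python
import os

# The kernel reports how many cores it sees, physical plus
# virtual/hyper-threaded ones; may be None if undetermined.
total = os.cpu_count()
print("cores the kernel reports:", total)

# Linux-only interface; other platforms lack these functions,
# hence the hasattr guard.
if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)  # 0 = the current process
    print("cores we may run on:", sorted(allowed))

    # Hint to the scheduler: pin this process to a single core...
    os.sched_setaffinity(0, {min(allowed)})
    print("now pinned to:", sorted(os.sched_getaffinity(0)))

    # ...then restore the original mapping.
    os.sched_setaffinity(0, allowed)
```

A program might use `total` to size its thread pool instead of spawning threads freely, which is exactly the kind of physically-informed decision described above.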
The takeaway is that when you look at things going from virtual to physical, you consider what abstractions each layer provides to you; when you go back from physical to virtual, you consider which policies you want to implement, based on what you want to achieve and on whatever knowledge of the more physical layer below is given to you.
"Best" can mean disparate things, though: you may want to maximize performance (typical in a game), but you may also want to minimize energy use (typical on mobile devices) instead, and these goals can be contradictory, so there is space to have very different scheduling policies(wp) depending on what a program needs to do, what the user wants from it, and what device it's running on.