The user should not perceive the delay between their (motor) action and the sensory response from the system. The maximum tolerable delay is about 100 ms (depending on which direction); for comparison, 25 fps corresponds to 40 ms per frame.
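The relation between frame rate and frame period quoted above is simple arithmetic. A minimal sketch, assuming the 100 ms figure is treated as an overall latency budget (the function names are hypothetical and only serve the illustration):

```python
def frame_period_ms(fps: float) -> float:
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def frames_within_budget(fps: float, budget_ms: float = 100.0) -> int:
    """Number of whole frame periods that fit inside the latency budget.

    budget_ms defaults to the ~100 ms figure from the notes (an assumption:
    the notes say the exact value depends on the direction).
    """
    return int(budget_ms // frame_period_ms(fps))


print(frame_period_ms(25))       # 40.0
print(frames_within_budget(25))  # 2
```

Under this reading, a single frame at 25 fps already takes 40 ms of the roughly 100 ms budget, so only two whole frame periods fit inside it.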
What is "natural" is what you have learnt in real life ... Even if we one day reach the same level of complexity (and some think this is not possible), one may wonder whether we really need perfection in order to be effective with regard to the expected result.
To mark the difference from the "screen, keyboard and mouse" set, which belongs to the Human-Computer Interface field ...
A Behavioural Interface is an apparatus that involves a human behaviour that is natural and requires no (or only a short) learning period.
All the senses may be taken into account, but not all of them are needed for every application.
Other definitions often mix up the purpose of VR, what it is, and what it is used for (applications). Moreover, one should not define VR by the tools it uses (head-mounted display, data glove ...).
Not all 4 of these points will be perfectly achieved in every VR application, but all 4 should at least be present if we want to be able to talk about a VR project.