If textures are fetched using tex1D(), tex2D(), or tex3D() rather than tex1Dfetch(), the hardware provides other capabilities that might be useful for some applications, such as image processing. (See Table 1.)
Feature | Use | Caveat |
---|---|---|
Filtering | Fast, low-precision interpolation between texels | Valid only if the texture reference returns floating-point data |
Normalized texture coordinates | Resolution-independent coding | |
Addressing modes | Automatic handling of boundary cases¹ | |
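The three features in the table can be illustrated with a minimal sketch. It uses the texture object API (`cudaTextureObject_t`) instead of the texture references mentioned in the text, since texture references are no longer available in recent CUDA toolkits. The helper name `sampleTexture`, the four-element data set, and the choice of wrap addressing are assumptions made for illustration only; error checking is omitted for brevity. Note that wrap addressing requires normalized coordinates.

```cuda
#include <cuda_runtime.h>

// Fetch one filtered sample at normalized coordinate u.
__global__ void sampleKernel(cudaTextureObject_t tex, float u, float *out)
{
    *out = tex1D<float>(tex, u);
}

// Hypothetical helper: builds a 4-texel 1D texture and samples it at u.
float sampleTexture(float u)
{
    const int n = 4;
    const float h_data[n] = {0.0f, 10.0f, 20.0f, 30.0f};

    // Filtering and addressing modes need a CUDA array, not linear memory.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &desc, n);  // height defaults to 0: 1D array
    cudaMemcpy2DToArray(arr, 0, 0, h_data, n * sizeof(float),
                        n * sizeof(float), 1, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.filterMode = cudaFilterModeLinear;      // interpolate between texels
    td.readMode = cudaReadModeElementType;     // float data, as filtering requires
    td.normalizedCoords = 1;                   // coordinates span [0, 1)
    td.addressMode[0] = cudaAddressModeWrap;   // out-of-range u wraps around

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    sampleKernel<<<1, 1>>>(tex, u, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    return h_out;
}
```

Under these assumptions, `sampleTexture(0.5f)` falls midway between the texels holding 10 and 20 and should return a value close to 15, and `sampleTexture(1.5f)` should return the same value because the coordinate wraps. Because the interpolation weights are stored at reduced precision, filtered results are approximate in general.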
Within a kernel call, the texture cache is not kept coherent with respect to global memory writes, so texture fetches from addresses that have been written via global stores in the same kernel call return undefined data. That is, a thread can safely read a memory location via texture if the location has been updated by a previous kernel call or memory copy, but not if it has been previously updated by the same thread or another thread within the same kernel call. This is relevant only when fetching from linear or pitch-linear memory because a kernel cannot write to CUDA arrays.
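The safe pattern described above can be sketched as two separate kernel launches: one that writes linear memory through global stores, and a later one that reads the same memory through a texture. The kernel names, the helper `writeThenFetch`, and the values written are hypothetical, and the sketch again assumes the texture object API with error checking omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel 1: updates linear memory through ordinary global stores.
__global__ void writeKernel(float *buf, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = 2.0f * i;
}

// Kernel 2: reads the same memory through the texture path.
// This is safe only because the writes finished in an earlier kernel call.
__global__ void readKernel(cudaTextureObject_t tex, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<float>(tex, i);
}

// Hypothetical helper: returns the last element as seen through the texture.
float writeThenFetch(int n)
{
    float *d_buf = nullptr, *d_out = nullptr;
    cudaMalloc(&d_buf, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));

    // Bind the linear buffer to a texture object.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = d_buf;
    res.res.linear.desc = cudaCreateChannelDesc<float>();
    res.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc td = {};
    td.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;

    // Two launches in the same stream run in order, so the texture fetches
    // in readKernel observe the stores made by writeKernel.
    writeKernel<<<blocks, threads>>>(d_buf, n);
    readKernel<<<blocks, threads>>>(tex, d_out, n);

    std::vector<float> h_out(n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(d_out);
    cudaFree(d_buf);
    return h_out[n - 1];
}
```

Merging the two kernels into one, so that a thread stores to `buf[i]` and then fetches the same address through `tex`, would fall into the undefined case described above.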