In at least one embodiment, any one of clusters 3614A-3614N of processing array 3612 can process data that will be written to any of memory units 3624A-3624N within parallel processor memory 3622. In at least one embodiment, memory crossbar 3616 can be configured to transfer an output of each cluster 3614A-3614N to any partition unit 3620A-3620N or to another cluster 3614A-3614N, which can perform additional processing operations on an output. In at least one embodiment, each cluster 3614A-3614N can communicate with memory interface 3618 through memory crossbar 3616 to read from or write to various external memory devices. In at least one embodiment, memory crossbar 3616 has a connection to memory interface 3618 to communicate with I/O unit 3604, as well as a connection to a local instance of parallel processor memory 3622, enabling processing units within different clusters 3614A-3614N to communicate with system memory or other memory that is not local to parallel processing unit 3602. In at least one embodiment, memory crossbar 3616 can use virtual channels to separate traffic streams between clusters 3614A-3614N and partition units 3620A-3620N.
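For illustration only, the routing behavior described above can be modeled in software. The following is a minimal sketch under stated assumptions, not a description of any actual hardware or driver interface: the names `Packet`, `MemoryCrossbar`, `send`, and `drain` are hypothetical, and the choice of one virtual channel per destination type with round-robin arbitration is an assumption made for the example rather than something the embodiment specifies.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical unit of traffic crossing the crossbar."""
    src_cluster: int   # index of the originating cluster
    dest_kind: str     # "partition" or "cluster"
    dest_id: int       # index of the destination unit
    payload: bytes


class MemoryCrossbar:
    """Toy model: routes cluster output to a partition unit or another cluster."""

    def __init__(self, num_clusters: int, num_partitions: int):
        self.limits = {"cluster": num_clusters, "partition": num_partitions}
        # Assumption: one virtual channel (a FIFO) per destination type, so
        # partition-bound and cluster-bound traffic are kept in separate streams.
        self.virtual_channels = {"partition": deque(), "cluster": deque()}
        # Payloads received by each destination, recorded for inspection.
        self.delivered = {
            kind: {i: [] for i in range(count)}
            for kind, count in self.limits.items()
        }

    def send(self, packet: Packet) -> None:
        """Queue a packet on the virtual channel that matches its destination type."""
        if packet.dest_kind not in self.limits:
            raise ValueError(f"unknown destination kind: {packet.dest_kind!r}")
        if not 0 <= packet.dest_id < self.limits[packet.dest_kind]:
            raise ValueError(f"destination index out of range: {packet.dest_id}")
        if not 0 <= packet.src_cluster < self.limits["cluster"]:
            raise ValueError(f"source cluster out of range: {packet.src_cluster}")
        self.virtual_channels[packet.dest_kind].append(packet)

    def drain(self) -> list:
        """Deliver all queued packets, alternating between channels (round robin)
        so that a backlog on one stream does not starve the other."""
        order = []
        while any(self.virtual_channels.values()):
            for kind, queue in self.virtual_channels.items():
                if queue:
                    pkt = queue.popleft()
                    self.delivered[kind][pkt.dest_id].append(pkt.payload)
                    order.append((kind, pkt.dest_id))
        return order
```

In this sketch, a cluster-to-cluster packet queued behind several partition-bound packets is still delivered on the first arbitration round, which is the kind of stream separation that virtual channels are intended to provide.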