As noted above, the orchestrator 130 may utilize historical information to determine selections for a current cycle. For example, the historical information may include previous orchestrations performed in respective previous cycles. Accordingly, the orchestrator 130 may be configured with a reinforcement learning model in which previous orchestrations performed by the orchestrator 130 are used to train the model, such that subsequent orchestrations by the orchestrator 130 more accurately realize the desired changes to the world.
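The passage above does not specify the form of the reinforcement learning model; one minimal sketch is a running value estimate per device, updated from the reward observed after each previous orchestration cycle. The class name, device identifiers, and the epsilon-greedy policy below are illustrative assumptions, not details from the source.

```python
import random

class OrchestratorModel:
    """Illustrative sketch of a reinforcement-learning selection policy:
    a running value estimate is kept per device and updated from the
    outcome (reward) of each previous orchestration cycle."""

    def __init__(self, device_ids, epsilon=0.1):
        self.values = {d: 0.0 for d in device_ids}  # estimated value per device
        self.counts = {d: 0 for d in device_ids}    # times each device was selected
        self.epsilon = epsilon                      # exploration rate

    def select(self):
        # Explore occasionally; otherwise exploit the best-known device.
        if random.random() < self.epsilon:
            return random.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, device_id, reward):
        # Incremental mean update from the observed cycle outcome.
        self.counts[device_id] += 1
        n = self.counts[device_id]
        self.values[device_id] += (reward - self.values[device_id]) / n
```

Under this sketch, each completed cycle feeds back into `update`, so selections in later cycles favor devices whose past orchestrations achieved the desired change.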
The orchestrator 130 may be configured to orchestrate the smart devices 110 that comprise the world or device ecosystem. Accordingly, the smart devices 110 may reside entirely within a given ecosystem, may include external devices (e.g., devices spanning multiple ecosystems), etc. The orchestrator 130 may thereby determine a selection of which of the smart devices 110 are to take action according to, for example, a request to affect the world, an exemplary implementation of which is described below. The orchestrator 130 may perform the selection by picking, from the available set of smart devices 110, at least one smart device 110 to satisfy the request. In determining the selection, the device orchestration system 100 configures the orchestrator 130 with a posterior orchestration framework.
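The posterior orchestration framework is not defined in detail here; one plausible reading is a Bayes-rule update in which the orchestrator weighs a prior over devices by the likelihood that each device satisfies the current request, then selects the device with the highest posterior probability. The function and the device identifiers below are hypothetical, offered only as a sketch of that reading.

```python
def posterior_select(priors, likelihoods):
    """Sketch of a posterior orchestration step: combine a prior over
    candidate devices with the likelihood that each device satisfies the
    current request, then select the maximum-posterior device.
    `priors` and `likelihoods` map hypothetical device ids to probabilities."""
    unnormalized = {d: priors[d] * likelihoods[d] for d in priors}
    total = sum(unnormalized.values())
    posterior = {d: p / total for d, p in unnormalized.items()}
    best = max(posterior, key=posterior.get)  # device most likely to satisfy the request
    return best, posterior
```

For example, with equal priors over a hypothetical lamp and set of blinds, a request such as "brighten the room" would assign the lamp a higher likelihood and therefore a higher posterior, so the lamp is selected.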