What is claimed is:

1. A method, comprising:
    accessing initial object prediction information associated with an image, wherein the initial object prediction information includes a plurality of initial predictions associated with a plurality of objects in the image, including bounding box information associated with the plurality of objects;
    presenting the image and at least a portion of the initial object prediction information to be displayed;
    receiving adjusted object prediction information pertaining to at least some of the plurality of objects, wherein the adjusted object prediction information is obtained from a user input made via a user interface configured for a user to make annotation adjustments to at least some of the initial object prediction information; wherein the user interface is configured to allow the user to:
        select a bounding box associated with an object and directly adjust the bounding box using the user interface; or
        change a type associated with the object using the user interface; and
    outputting updated object prediction information, wherein the updated object prediction information is based at least in part on the adjusted object prediction information; wherein:
        the image and bounding boxes associated with a subset of objects in the image are displayed;
        the subset of objects have corresponding confidence levels that at least meet a prespecified confidence level threshold;
        a corresponding confidence level of an object in the subset of objects indicates how confident a machine learning model is in making a prediction associated with the object, the prediction including a bounding box of the object;
        the image and a hint associated with an object in the image are displayed, the hint being a visual indication of an object that is more compact than a bounding box associated with the object; and
        the user interface is further configured to, in response to the hint being selected, display the bounding box associated with the object.

2. The method of claim 1, wherein the initial object prediction information is generated by the machine learning (ML) model.

3. The method of claim 1, wherein the initial object prediction information further includes classification information associated with the plurality of objects in the image.

4. The method of claim 1, wherein the user interface is configured to, in response to a cursor being placed over or near a predicted bounding box or an object, provide an editing interface for the predicted bounding box.

5. The method of claim 1, wherein the user interface is configured to, in response to a cursor being placed over an object or a bounding box that is hidden, display the bounding box associated with the object.

6. The method of claim 1, wherein:
    the user interface is configured to make bounding boxes of N objects closest to a cursor visible and hide remaining bounding boxes of other objects, wherein N is a prespecified natural number.

7. The method of claim 1, wherein:
    the image and a hint associated with an object in the image are displayed; and
    the user interface is further configured to:
        in response to the hint being selected, display a bounding box associated with the object and allow the user to make an adjustment to the bounding box; and
        in response to receiving the user input, update prediction information associated with the bounding box.

8. The method of claim 1, wherein:
    the image and an initial hint associated with an object in the image are displayed; and
    the user interface is further configured to:
        in response to the initial hint being selected, display a bounding box associated with the object and allow the user to make an adjustment to the bounding box;
        in response to receiving the user input, update prediction information associated with the bounding box; and
        replace the bounding box with an updated hint that is distinct from the initial hint.

9. A system, comprising:
    one or more processors configured to:
        access initial object prediction information associated with an image, wherein the initial object prediction information includes a plurality of initial predictions associated with a plurality of objects in the image, including bounding box information associated with the plurality of objects;
        present the image and at least a portion of the initial object prediction information to be displayed;
        receive adjusted object prediction information pertaining to at least some of the plurality of objects, wherein the adjusted object prediction information is obtained from a user input made via a user interface configured for a user to make annotation adjustments to at least some of the initial object prediction information; wherein the user interface is configured to allow the user to:
            select a bounding box associated with an object and directly adjust the bounding box using the user interface; or
            change a type associated with the object using the user interface; and
        output updated object prediction information, wherein the updated object prediction information is based at least in part on the adjusted object prediction information; wherein:
            the image and bounding boxes associated with a subset of objects in the image are displayed;
            the subset of objects have corresponding confidence levels that at least meet a prespecified confidence level threshold;
            a corresponding confidence level of an object in the subset of objects indicates how confident a machine learning model is in making a prediction associated with the object, the prediction including a bounding box of the object;
            the image and a hint associated with an object in the image are displayed, the hint being a visual indication of an object that is more compact than a bounding box associated with the object; and
            the user interface is further configured to, in response to the hint being selected, display the bounding box associated with the object; and
    one or more memories coupled to the one or more processors and configured to provide the one or more processors with instructions.

10. The system of claim 9, wherein the initial object prediction information is generated by the machine learning (ML) model.

11. The system of claim 9, wherein the initial object prediction information further includes classification information associated with the plurality of objects in the image.

12. The system of claim 9, wherein the user interface is configured to, in response to a cursor being placed over or near a predicted bounding box or an object, provide an editing interface for the predicted bounding box.

13. The system of claim 9, wherein the user interface is configured to, in response to a cursor being placed over an object or a bounding box that is hidden, display the bounding box associated with the object.

14. The system of claim 9, wherein:
    the user interface is configured to make bounding boxes of N objects closest to a cursor visible and hide remaining bounding boxes of other objects, wherein N is a prespecified natural number.

15. The system of claim 9, wherein:
    the image and a hint associated with an object in the image are displayed; and
    the user interface is further configured to:
        in response to the hint being selected, display a bounding box associated with the object and allow the user to make an adjustment to the bounding box; and
        in response to receiving the user input, update prediction information associated with the bounding box.

16. The system of claim 9, wherein:
    the image and an initial hint associated with an object in the image are displayed; and
    the user interface is further configured to:
        in response to the initial hint being selected, display a bounding box associated with the object and allow the user to make an adjustment to the bounding box;
        in response to receiving the user input, update prediction information associated with the bounding box; and
        replace the bounding box with an updated hint that is distinct from the initial hint.

17. A computer program product for image annotation, the computer program product being embodied in a tangible non-transitory computer readable storage medium and comprising computer instructions for:
    accessing initial object prediction information associated with an image, wherein the initial object prediction information includes a plurality of initial predictions associated with a plurality of objects in the image, including bounding box information associated with the plurality of objects;
    presenting the image and at least a portion of the initial object prediction information to be displayed;
    receiving adjusted object prediction information pertaining to at least some of the plurality of objects, wherein the adjusted object prediction information is obtained from a user input made via a user interface configured for a user to make annotation adjustments to at least some of the initial object prediction information, wherein the user interface is configured to allow the user to:
        select a bounding box associated with an object and directly adjust the bounding box using the user interface; or
        change a type associated with the object using the user interface; and
    outputting updated object prediction information, wherein the updated object prediction information is based at least in part on the adjusted object prediction information; wherein:
        the image and bounding boxes associated with a subset of objects in the image are displayed;
        the subset of objects have corresponding confidence levels that at least meet a prespecified confidence level threshold;
        a corresponding confidence level of an object in the subset of objects indicates how confident a machine learning model is in making a prediction associated with the object, the prediction including a bounding box of the object;
        the image and a hint associated with an object in the image are displayed, the hint being a visual indication of an object that is more compact than a bounding box associated with the object; and
        the user interface is further configured to, in response to the hint being selected, display the bounding box associated with the object.
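
By way of non-limiting illustration only, the display-selection behavior recited above (showing bounding boxes whose confidence levels at least meet a prespecified threshold, and making visible the bounding boxes of the N objects closest to a cursor) might be sketched as follows. The names `Prediction`, `boxes_to_display`, and `nearest_to_cursor`, the box layout `(x_min, y_min, x_max, y_max)`, and the use of box-center Euclidean distance as the measure of closeness are all assumptions made for this sketch; the claims do not prescribe any of them.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class Prediction:
    """One initial prediction: a type label, a bounding box, a confidence."""
    label: str
    box: Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max)
    confidence: float  # assumed to lie in [0, 1]

    @property
    def center(self) -> Tuple[float, float]:
        x_min, y_min, x_max, y_max = self.box
        return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def boxes_to_display(predictions: List[Prediction],
                     threshold: float) -> List[Prediction]:
    """Keep predictions whose confidence at least meets the threshold."""
    return [p for p in predictions if p.confidence >= threshold]


def nearest_to_cursor(predictions: List[Prediction],
                      cursor: Tuple[float, float],
                      n: int) -> List[Prediction]:
    """Return the n predictions whose box centers are closest to the cursor.

    Distance from cursor to box center is an assumed closeness measure.
    """
    cx, cy = cursor
    ranked = sorted(
        predictions,
        key=lambda p: hypot(p.center[0] - cx, p.center[1] - cy),
    )
    return ranked[:n]


# Hypothetical sample data, for illustration only.
predictions = [
    Prediction("car", (0, 0, 10, 10), 0.90),
    Prediction("dog", (100, 100, 120, 120), 0.40),
    Prediction("person", (20, 20, 40, 40), 0.75),
]

visible = boxes_to_display(predictions, threshold=0.75)
near = nearest_to_cursor(predictions, cursor=(110, 110), n=2)
```

In this sketch, predictions that are not selected for display as bounding boxes could instead be rendered as hints; that pairing is one possible arrangement and is not required by the claim language.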